45 Comments
ClagMaster - Tuesday, March 23, 2010 - link
I would like to see a 32nm Quad Core Westmere for Socket 1156 while retaining the on-die memory controller and more capable PCIe controller. I use a discrete graphics card and do not mind the P55 solution. These Quad Cores would be mainstream products but consume less power at the same clock, or operate at a higher clock for the same power, and/or be more overclockable.
A socket 1156 Westmere Quad Core priced between $200 and $300 is useful to this mainstream user while a socket 1366 Hex Core or socket 1156 Dual Core is not useful to me.
glenster - Saturday, February 6, 2010 - link
World's Fastest Graphene Transistor: 100 billion cycles/second (100 GigaHertz)
http://www.sciencedaily.com/releases/2010/02/10020...
tygrus - Tuesday, February 9, 2010 - link
It takes several transistors to make logic gates. It takes several logic gates to make a pipeline stage. Each pipeline stage must be in sync with the core clock. The core clock is buffered and propagated across the core. If the longest path in a pipeline stage takes 15 transistors then you need transistors to be at least 45GHz for a 3GHz clock target (15x3=45). Some stages such as L2 cache access or complex division take several cycles to complete but are still in sync.
TheGreek - Thursday, February 4, 2010 - link
But folders tend to run their machine 100% loaded 24/7/365, so the power savings isn't much of an issue.
But the sooner Intel lets us know that the chipset is also 32nm the better.
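For anyone who wants to play with tygrus's rule of thumb above (15 transistors on the longest path, 3GHz target, so 45GHz per transistor), here is a minimal Python sketch. It assumes the simplest possible model: every transistor on the critical path switches one after another and nothing else (wire delay, latch setup time, clock skew) costs any time. The function names are made up for illustration.

```python
def required_switching_ghz(path_depth: int, clock_ghz: float) -> float:
    """Switching rate each transistor needs if `path_depth` of them
    must switch in sequence within one clock cycle."""
    return path_depth * clock_ghz


def max_clock_ghz(path_depth: int, switching_ghz: float) -> float:
    """Inverse view: highest clock a given transistor speed allows
    for a critical path of `path_depth` sequential transistors."""
    return switching_ghz / path_depth


if __name__ == "__main__":
    # The example from the comment: 15 transistors, 3GHz clock target.
    print(required_switching_ghz(15, 3.0))
    # Same path depth with the 100GHz graphene transistor linked earlier.
    print(round(max_clock_ghz(15, 100.0), 2))
```

Under this toy model a 100GHz transistor with the same 15-deep path would top out somewhere under 7GHz, which is why a headline transistor frequency does not translate directly into a CPU clock.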
georgekn3mp - Thursday, February 4, 2010 - link
Gulftown (Non-Extreme Edition) i7-970 6-core...should be released by 3Qtr 2010 for $564 bulk. That's 6 cores, clocked at around 3.4GHz at 130 Watts, by fall 2010.
As opposed to i7-980Xe "Extreme" Gulftown at $999...
Robear - Friday, February 5, 2010 - link
Wasn't it said somewhere that the 6-core extreme would be 2.66GHz? I mean, unless they really pulled off something spectacular with Westmere, it's hard to see 6 cores fitting in a 130W TDP over 3GHz.
Mike1111 - Thursday, February 4, 2010 - link
I think we'll have to wait for 32nm quad core desktop CPUs (Lynnfield successor) until Sandy Bridge in Q1/2011. Everything else doesn't make much sense because Intel has just released or will soon release updates in the other segments (32nm mobile CPUs just released, 32nm high-end desktop and server coming in March). What else could be the first Sandy Bridge chip?
Voldenuit - Thursday, February 4, 2010 - link
Whoa. Most of your systems are idling at/above 100W? @_@ Something is seriously rotten in the state of Denmark here.
Even the Athlon II X4 955 BE system at SPCR drew only 56W (DC)
http://www.silentpcreview.com/article1013-page5.ht...
And their Intel Core i5 661 system sips a miserly 18W (!) at idle.
http://www.silentpcreview.com/article1019-page5.ht...
Even factoring in PSU conversion losses, the high idle power draw of your test systems is worrying. Is that due to component choices or turning off power saving features? If the former, it seems rather pointless to be comparing CPU power efficiency when the rest of the system is drawing inordinate amounts of power, and if the latter, why?
kmmatney - Thursday, February 4, 2010 - link
They aren't measuring the same thing, though. The SPCR review is measuring the load on the ATX 12V connector, which is the 4-pin connector just used to provide (or supplement) CPU power. The Anandtech measurements are taken at the wall, and include the PSU, motherboard, and other system devices.
Voldenuit - Thursday, February 4, 2010 - link
SPCR is measuring total system power (DC output) as well as just the CPU+VRM (the ATX12V figure). They report both figures. I fail to see where the confusion lies?
xaris106 - Thursday, February 4, 2010 - link
1. It's not the same system. It might have a discrete, power-hungry GPU and other things.
2. Wall power measures everything in the tower, including PSU losses, HDDs, fans, optical drives etc.
Voldenuit - Thursday, February 4, 2010 - link
1. In which case it's silly to discuss (and compare) power consumption for a supposedly power efficient CPU with peripherals that take up 6-8 times as much power as the CPU, non?
2. SPCR also includes everything in its system power, except for PSU losses. And I refuse to believe that a modern CPU could be only 10% efficient at 100W.
It still does not explain why AT's test systems are drawing so much power at idle. Most modern GPUs are pretty efficient at idle these days. If it's an issue with GPU or other component choice, then the testbed needs to be updated; otherwise it is useless as a metric for CPU power efficiency. Likewise with software/drivers/Windows settings.
ereavis - Thursday, February 4, 2010 - link
See AT i7-870 review.
SPCR setup: motherboard, two sticks of memory, a notebook hard drive, idle Blu-ray drive, keyboard and mouse
AT setup: motherboard(same), four sticks (more), SSD (less), no optical (less), GTX 280 (probably all the difference)
so like the previous person told you, --DISCRETE VIDEO CARD-- vs SPCR's i5 IGP.
ereavis - Thursday, February 4, 2010 - link
And no, it's certainly not "silly" to discuss power consumption of an entire system, as it's a real-world test. The i5-750 and i7-860 required a discrete card. SPCR reviewed chips with on-chip graphics, so did not need a discrete card. The i5/i7 comparison is to other chips using as close to the same systems as possible, the differences being unavoidable, i.e. you can't run an i7-920 with 2x2 GB DDR3 and the various chipsets are ALWAYS going to be running along with the processor.
Voldenuit - Thursday, February 4, 2010 - link
Actually, all the SPCR tests bar one were done with a 9400GT and a Raptor HDD (actual model unspecified).
Looking at GTX285 idle power consumption across various websites, it actually seems to idle quite well, at around 30-40W - the 9400 GT idled around 10W (est) at SPCR.
That still doesn't explain the discrepancy between AT's idle figures and SPCR's. Even factoring in the GTX280/285, and including PSU losses, there's about 40-50W discrepancy in the X3 720 results, which amounts to 90-100% difference from SPCR's figures. I doubt minor component changes (RAM, HDDs) would make up so much difference, especially at idle, when their power draw is lowest.
It may be a matter of software/bios/acpi configuration settings, or it could just be that the testing methodology of the two sites is not directly comparable. But I am curious which, and why.
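To make the DC-versus-wall comparison in this exchange concrete, here is a minimal Python sketch. The 56W (DC) and 100W (wall) figures are the ones quoted in the thread; the 80% PSU efficiency is purely an assumed value for illustration, since neither site's actual PSU efficiency at idle is given here. The function names are hypothetical.

```python
def wall_power_w(dc_power_w: float, psu_efficiency: float) -> float:
    """Estimate AC draw at the wall from DC output, given PSU efficiency (0-1]."""
    if not 0 < psu_efficiency <= 1:
        raise ValueError("psu_efficiency must be in (0, 1]")
    return dc_power_w / psu_efficiency


def unexplained_gap_w(wall_measured_w: float, dc_power_w: float,
                      psu_efficiency: float) -> float:
    """Wall power left over after accounting for PSU conversion losses."""
    return wall_measured_w - wall_power_w(dc_power_w, psu_efficiency)


if __name__ == "__main__":
    ASSUMED_EFFICIENCY = 0.80  # assumption, not a measured figure
    # SPCR's 56W DC idle figure, converted to an estimated wall figure.
    print(round(wall_power_w(56.0, ASSUMED_EFFICIENCY), 1))
    # Gap to a ~100W idle reading taken at the wall.
    print(round(unexplained_gap_w(100.0, 56.0, ASSUMED_EFFICIENCY), 1))
```

Whatever is left after the conversion would have to come from component differences (the discrete GPU, extra RAM) or power-saving settings, which is the open question in the thread.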
DigitalFreak - Thursday, February 4, 2010 - link
ZZZZzzzzzzz.....
vlado08 - Wednesday, February 3, 2010 - link
I am wondering whether I should wait for Advanced Vector Extensions (AVX).
Is it going to double the speed of video encoding?
If so then it is going to make a huge difference.
I am waiting for the Intel Developer Forum. It will probably shed more light on the subject.
IntelUser2000 - Wednesday, February 3, 2010 - link
It will only matter when the programs are optimized. Usually media extensions like these are overrated, as you don't get immediate benefits.
Drag0nFire - Wednesday, February 3, 2010 - link
I've been passively seeking a new laptop for a while now, but I'm not sure Arrandale gives me sufficient motivation to upgrade. What can you tell me about the next generation mobile chipset?
Doormat - Wednesday, February 3, 2010 - link
I guess my buying decision depends on how quick we might see a 4 core "cheap" ($300) 32nm-based CPU. Q2? I'll wait. Q4? Too long.
Inspector2211 - Wednesday, February 3, 2010 - link
Hmmm.
AMD is about a year behind Intel with respect to process technology, and Intel just introduced 32nm processors this January.
Judging from that, it is very likely that AMD will introduce 32nm processors in Q1/2011, and since quad core processors seem to be their bread and butter now (look at the $99 "620", for instance), it is not unreasonable to speculate that AMD could beat Intel to a 32nm quad core - but if so, only by a few weeks, of course.
FlyTexas - Wednesday, February 3, 2010 - link
Yes, I know there are server and other uses for all those cores, but it would be nice if more software used all the cores.
I have a Core i7 920, and I find that I rarely use more than 2 cores for anything.
dilidolo - Wednesday, February 3, 2010 - link
Depending on what OS you use.
jakejones - Wednesday, February 3, 2010 - link
Aside from the server market, there are some folks that use Photoshop quite heavily; and with PS, it can use as many cores, as many threads, and as much memory as you can throw at it, and some features can still take a very long time to run. I anxiously await the day that I can have 6 OC'd cores running instead of my current 4.
Holly - Wednesday, February 3, 2010 - link
... which is of course bullshit. All that matters is how threaded the application you run is. If you have a single-threaded application running on OS X and a single-threaded application running on OS Y, they will both run on only one thread.
There is also no use in forcing multithreading at all costs. Yes, you can make your average calculator use 12 threads, but why? Totally useless when one thread takes like 0.0000000001s of CPU time. The more threads, the more CPU cost to manage the threads, their synchronization, accesses and all. Going threaded is a good thing if there is a gain in doing so. Going threaded at all costs is just ego-enlargement.
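Holly's point about thread overhead can be sketched with a toy cost model in Python. Everything here is an assumption for illustration: the work is treated as perfectly divisible across threads, each thread adds a fixed management cost, and the 10-microsecond overhead figure is invented rather than measured. The function names are hypothetical.

```python
def run_time_s(work_s: float, threads: int, overhead_per_thread_s: float) -> float:
    """Toy model: work splits evenly, each thread adds a fixed management cost."""
    return work_s / threads + overhead_per_thread_s * threads


def best_thread_count(work_s: float, overhead_per_thread_s: float,
                      max_threads: int = 12) -> int:
    """Thread count in 1..max_threads that minimizes the modeled run time."""
    return min(range(1, max_threads + 1),
               key=lambda n: run_time_s(work_s, n, overhead_per_thread_s))


if __name__ == "__main__":
    ASSUMED_OVERHEAD_S = 1e-5  # invented per-thread cost, for illustration only
    # The "calculator" case from the comment: a trivially small job.
    print(best_thread_count(1e-10, ASSUMED_OVERHEAD_S))
    # A long-running, well-parallelized job (think a heavy Photoshop filter).
    print(best_thread_count(10.0, ASSUMED_OVERHEAD_S))
```

With these assumed numbers the tiny job is fastest on a single thread, while the long job keeps getting faster all the way up to 12 threads, which is the gain-versus-cost tradeoff the comment describes.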
milli - Wednesday, February 3, 2010 - link
It looks to be a dual triple core on one die. This would mean the 12MB L3 isn't unified but is actually 2 x 6MB L3.
Can you confirm if this is true or not?
IntelUser2000 - Wednesday, February 3, 2010 - link
It may look like it but it's probably not. It was probably done because then they can just put the routing interface between the two groups of 3 cores.
If what you said was true, Nehalem would have been 2x4MB L3: http://www.devicedaily.com/wp-content/uploads/2008...
No, in Nehalem, they can REALLY add cores and caches much easier than they did before.
redpriest_ - Wednesday, February 3, 2010 - link
Intel still has to obey the laws of physics. If core 5 wants L3 cache information that's stored in the cache right below core 0, how much do you want to bet it will get it later than if the information is stored right below core 5?
IntelUser2000 - Wednesday, February 3, 2010 - link
Same with Nehalem. Look at the pic I have shown. It's the reason why L3 cache runs slower nowadays. To fit with multi-cores better.
Mr Perfect - Wednesday, February 3, 2010 - link
First, from those pictures it looks like 1366 won't be getting on-board PCIe controllers. I'm guessing that having the X58 with its own PCIe controller sitting between the CPU and the slots is the reason, but I was hoping to see this make it to the new chips.
On another note, I'm really hoping for a 32nm quad core on 1156, and soon. I'm eyeballing an i7 mITX build, and a die shrink would help keep thermals down.
IntelUser2000 - Wednesday, February 3, 2010 - link
I think you need to check again which chips do power gating for L3, Anandtech.
On the presentation titled: The Platform Evolves: Understanding the Intel Next Generation Microarchitectures (Nehalem and Westmere)
Filename: ACHS002
Page 10 says:
"Extended in 2009 platforms as Integrated Power Gates also used in shared cache and I/O logic to dynamically power down when inactive".
2009 doesn't seem like Westmere does it?
TETRONG - Wednesday, February 3, 2010 - link
That's great. Just get to 10 GHz already :|
IntelUser2000 - Wednesday, February 3, 2010 - link
(Sorry Tetrong, I accidentally reported you. MODS, please do not ban or delete the post!!)
Tetrong, let's see what really happens at the simplest level using the simplest math. At 10GHz frequency, light in vacuum can travel 0.3 centimetres or 30mm per clock cycle. Now if we assume the core is perfectly square, we can use the Pythagorean theorem to calculate the maximum die size.
You can have a maximum die size of 21mm on each side or 441mm2 for 10GHz frequency, if EVERYTHING is perfect. That won't be true, even if the circuitry doesn't have any faults reaching such frequency. True, the core execution won't be anywhere near 400mm2, so 10GHz is possible. But as of now, we won't see it in the near future.
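The 30mm-per-cycle and 21mm-per-side figures can be checked with a few lines of Python. This is only a sketch of the back-of-envelope argument in the comment: it assumes a signal moving at the vacuum speed of light that has to cross the diagonal of a perfectly square die within one clock cycle. The function names are made up for illustration.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre


def distance_per_cycle_mm(clock_hz: float) -> float:
    """Distance light in vacuum covers during one clock period, in mm."""
    return SPEED_OF_LIGHT_M_PER_S / clock_hz * 1000.0


def square_side_from_diagonal_mm(diagonal_mm: float) -> float:
    """Side of a square whose diagonal is `diagonal_mm` (Pythagorean theorem)."""
    return diagonal_mm / math.sqrt(2)


if __name__ == "__main__":
    diagonal = distance_per_cycle_mm(10e9)          # 10GHz clock
    side = square_side_from_diagonal_mm(diagonal)
    print(f"per cycle: {diagonal:.1f} mm, side: {side:.1f} mm, "
          f"area: {side * side:.0f} mm^2")
```

Rounding the side down to 21mm gives the 441mm2 quoted in the comment; without rounding, the area comes out slightly larger, at roughly 450mm2.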
ssj4Gogeta - Thursday, February 4, 2010 - link
Your calculations are right, it's 3cm or 30mm.
But electrons do NOT travel at the speed of light. Going by your theory, that would make the max die size just a fraction of 441 sq. mm.
Actually it may be even smaller, because I'm quite sure electrons don't take a linear path inside the processor. But the main reasons why we still don't see a 10GHz chip are completely unrelated to this.
Vutshi - Thursday, February 4, 2010 - link
Indeed! The electrons do not travel at the speed of light and they never have to. It is the electromagnetic field that forces them to move, and the propagation speed of the field is the speed of light (well, almost, in the dielectric). So the calculation is valid.
LaughingTarget - Wednesday, February 10, 2010 - link
It's true that electrons don't travel at the speed of light. They travel at a few inches per hour through any form of conductor. Electrons are "bumped" from the rear by an entering electron. You need to think of the inside of a transistor, wire, or other conductor as a huge traffic jam. Electrons are lined up and when hit from behind by a new electron, the whole thing just moves forward slowly until the one at the front takes all the force and is shot out.
A person can calculate using a manual stopwatch how long it takes for an actual electron to make its way through a circuit. You could leave the system on all day and the first electron that entered your PC would find its way to ground by the end of the day. That's how slow electrons are.
What needs to be considered is how easy and quickly the force generated by the entering electron reaches the lead electron, not the electron movement itself.
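The "few inches per hour" claim can be sanity-checked with the standard drift velocity formula, v = I / (n · q · A). The Python sketch below assumes a copper wire with a free-electron density of about 8.5e28 per cubic metre; the 1A current and 1 square millimetre cross-section are arbitrary example values, not anything taken from a real CPU. The function names are made up for illustration.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # charge of one electron, in coulombs
COPPER_ELECTRON_DENSITY_M3 = 8.5e28     # approximate free electrons per m^3 in copper


def drift_velocity_m_per_s(current_a: float, cross_section_m2: float,
                           electron_density_m3: float = COPPER_ELECTRON_DENSITY_M3) -> float:
    """Average electron drift velocity: v = I / (n * q * A)."""
    return current_a / (electron_density_m3 * ELEMENTARY_CHARGE_C * cross_section_m2)


def inches_per_hour(velocity_m_per_s: float) -> float:
    """Convert metres per second to inches per hour."""
    return velocity_m_per_s * 3600.0 / 0.0254


if __name__ == "__main__":
    # Example values (assumed): 1 A through a 1 mm^2 copper conductor.
    v = drift_velocity_m_per_s(current_a=1.0, cross_section_m2=1e-6)
    print(f"{v:.2e} m/s, about {inches_per_hour(v):.0f} inches per hour")
```

For these example values the drift speed works out to roughly ten inches per hour, so the order of magnitude in the comment holds for ordinary copper wiring. The signal itself travels as a field, which is why the drift speed does not limit the clock.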
nonoski - Thursday, February 4, 2010 - link
You guys have no idea what you are talking about!!! Stop with the uninformed techno babble already! There is not one modern processor that needs to get a signal from one end of the chip to the other in ONE clock cycle!!!!!!!!!!!! Just stop.
stephenbrooks - Monday, March 15, 2010 - link
Quite true! I believe the reason clocks are stuck around 3GHz is more to do with thermal/power usage considerations than feasibility. And if CPUs are doing more "per clock" then the clock rate isn't a particularly good metric of speed either. Can't help thinking at some point they'll run out of instruction-level parallelism and have to start upping the clocks on the executing parts again though.
ssj4Gogeta - Friday, February 5, 2010 - link
That's why I said "going by your theory..." and I also said the reasons are completely unrelated.
I was just commenting on the calculations, not the theory.
IntelUser2000 - Thursday, February 4, 2010 - link
Damnit, so I did have the original calculation right lol. Somehow I was really confused there.
Well I do realize that. Only in vacuum can light reach that speed, and electrons in a wire are even slower.
You don't need to limit the max die size to 441 sq mm. You can just have different clocks for each part of the processor, or let each part run on its own clock.
mikepers - Wednesday, February 3, 2010 - link
Maybe just a typo and the rest of your comment is correct, but fwiw .3 centimeters is 3mm, not 30mm.
IntelUser2000 - Wednesday, February 3, 2010 - link
Damnit! Yea, you are right, thanks for pointing it out.
At this rate we'll see 10GHz by 2025.
puffpio - Wednesday, February 3, 2010 - link
I would assume it would be a given, since not all 6-core Westmeres will come out of the fab perfectly... shut off 2 cores and sell it as a 4-core.
BSMonitor - Thursday, February 4, 2010 - link
Intel to date has not done that with core counts. Defects in cache have led to other processor derivatives. But no 3-core or 2-core Nehalems by Intel to date, and I doubt you will see them here.
Roy2001 - Wednesday, February 3, 2010 - link
Hope they release a non-extreme version soon.