What is Hyper-Threading in Intel processors? Hyper-Threading technology: what it is, how to enable and use it, and the software and hardware factors involved

Today I decided to cover whether it is advisable to buy a processor with Hyper-Threading for games.

The first thing to note is that there is no single answer to this question. For some, Hyper-Threading is a necessity, while for others it is a waste of money. I will analyze both cases, and after reading the article (I hope) everyone will be able to assess which one applies to them and draw INFORMED conclusions about whether a processor with Hyper-Threading is worth buying.

Hyper-Threading lets one physical processor core handle two threads in parallel by presenting itself as two logical processors. The idea is well captured by the following quote:

When a pause occurs during the execution of a thread on one of the logical processors (as a result of a cache miss, branch prediction error, or waiting for the result of a previous instruction), control is transferred to the thread in another logical processor. Thus, while one process is waiting, for example, for data from memory, the computing resources of the physical processor are used to process another process.

Applications that DO NOT need Hyper-Threading.

Hyper-Threading is NOT needed for:

  • 90% of computer games, both modern and those that will be released in the coming years;
  • office applications.

Justification for the uselessness of Hyper-Threading.

The performance gain from Hyper-Threading varies widely, from 0% (complete uselessness) to 30% (very noticeable), and depends on the following factors:

1. Optimization of a single application for working with 8 or more threads.

If the application is not optimized for 8 threads, then Hyper-Threading will not provide any benefits.

In some cases, when software that is not prepared for it tries to work with 8 logical cores, an 8-thread processor even shows worse results than its “younger brother” without Hyper-Threading.

2. CPU load percentage

The higher the processor load percentage, the more noticeable the impact of Hyper-Threading. And vice versa - at low load you will not notice its influence.

Based on these data, we can conclude that Hyper-Threading is NOT needed for:

  • 90% of computer games, modern and those that will be released in the coming years. They do not provide enough CPU load;
  • office applications.

Where is Hyper-Threading NEEDED?

  • The benefits of Hyper-Threading are undeniable in 3ds Max and other professional applications. In my experiments, this technology reduced rendering time by 30%;
  • Hyper-Threading is also useful for the TOP 10% of modern computer games (such as Crysis 3), as well as similar games that will be released in the future.

More reasons to use Hyper-Threading for gaming

Even though there are really few games on PC today that are optimized for 8 threads, I still think that buying an i7 with 8 threads makes sense, especially with an eye to the future.

Firstly, gaming computers, as I understand them, should be built not for the majority of games but for the best games. And there are in fact games today that are optimized for 8 threads and load the CPU to 70% or more.

Secondly, we can only expect improvements in games and, as a consequence, an increase in their demands on the CPU. Especially taking into account the fact that consoles ALREADY have 8 cores and this should be taken as the “bar” for gaming systems for the coming years.

I note that this is not the speculation of an individual blogger, but the forecast of two teams of top professionals who work on the PlayStation and Xbox platforms.

Thirdly, a processor ages 2-3 times more slowly than a video card. This lets you replace the video card in, say, a year or two and enjoy the latest games. But this is only possible if the processor can keep up with both the new video card and the new game. Otherwise it becomes the bottleneck and prevents the video card from showing its full potential in any processor-demanding game.

Taking all three points into account, buying a processor with Hyper-Threading looks like a very reasonable decision for gaming computers.

There are claims on the Internet that Hyper-Threading is essentially useless.

I decided to run a mini-test of my own, rendering a small scene with Hyper-Threading turned off and then on.

First, Hyper-Threading is turned off. Rendering time: 188 seconds.

Turn it on. Rendering time drops to 151 seconds.
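The two timings above can be converted into a percentage with simple arithmetic. Below is a minimal sketch in Python; the function names are illustrative and are not part of any tool used in the test.

```python
def time_reduction_percent(t_off: float, t_on: float) -> float:
    """Percentage by which render time fell after enabling Hyper-Threading."""
    return (t_off - t_on) / t_off * 100.0


def speedup(t_off: float, t_on: float) -> float:
    """How many times faster the render finished with Hyper-Threading on."""
    return t_off / t_on


if __name__ == "__main__":
    # Timings from the mini-test: 188 s with HT off, 151 s with HT on.
    print(f"time saved: {time_reduction_percent(188, 151):.1f}%")
    print(f"speedup:    {speedup(188, 151):.2f}x")
```

For this scene the render is roughly 20% shorter with Hyper-Threading on, which is the same as finishing about 1.25 times faster.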

Users who have configured the BIOS at least once have probably noticed an Intel Hyper Threading parameter there. Many people do not know what this technology is or what it is used for. Let's try to figure out what Hyper Threading is, how to enable it, and what advantages this setting gives the computer. In principle, there is nothing difficult to understand here.

Intel Hyper Threading: what is it?
Without going deep into the jungle of computer terminology, and putting it in simple terms, this technology was developed to increase the number of instructions the central processor handles simultaneously. Modern processor chips typically use only about 70% of their available computing capacity. The rest remains, so to speak, in reserve. As for instruction streams, in most cases a core works on only one thread, even though the processor is multi-core.

Basic operating principles
To increase the capabilities of the central processor, Hyper Threading technology was developed. It makes it possible to split one instruction stream into two, or to add a second thread to an existing one. This second thread is virtual: there is no separate physical core behind it. This approach can significantly increase processor performance, and the whole system, accordingly, works faster. The performance gain can fluctuate quite a bit, which is discussed separately below. However, the developers of Hyper Threading themselves state that a virtual thread does not match a full-fledged core. In some cases the use of this technology is fully justified, and once you understand how Hyper Threading processors work, it is easy to tell which cases those are.

Historical reference
Let's dive a little into the history of this development. Hyper Threading support first appeared in Intel Pentium 4 processors. Later, the technology returned in the Intel Core iX series (X here stands for the series number). It is worth noting that for some reason it is missing from the Core 2 line of processors. True, in the Pentium 4 era the performance gain was rather modest: around 15-20%. This suggested that the processor lacked the necessary computing power and that the technology was ahead of its time. Today, Hyper Threading is supported in many modern Intel chips. The technology itself occupies only about 5% of the chip area, leaving the rest for processing instructions and data.

The issue of conflicts and performance
All this is of course good, but in some cases data processing can actually slow down. This is mostly due to the branch prediction module and to an insufficient cache size, which forces the cache to be reloaded constantly. As for this module, the first thread may require data from the second, which has not been processed yet or is still waiting in the queue. Equally common are situations where the processor core is already heavily loaded and the module nevertheless keeps sending data to it. Some programs, for example resource-intensive online games, can slow down seriously simply because they are not optimized for Hyper Threading. What happens with games? The user's system, for its part, tries to optimize the data streams from the application to the server. The problem is that the game does not know how to distribute its data streams on its own and lumps everything into one pile. By and large, it may simply not be designed for this. Sometimes the performance gain on dual-core processors is significantly higher than on 4-core processors. Dual-core chips simply do not have enough computing power on their own, so the extra logical threads help them more.

How to enable Hyper Threading in BIOS?
We have covered what Hyper Threading technology is and the history of its development. How do you activate it? It is quite simple: you use the BIOS setup utility. Depending on the machine, it is entered with keys such as Del, F1, F2, F3, F8, F12, F2+Del, etc. Sony Vaio laptops have their own method: the dedicated ASSIST key. If the processor supports Hyper Threading, the BIOS settings should contain a dedicated line for it. In most cases it is called Hyper Threading Technology, and sometimes Function. Depending on the BIOS vendor and version, this parameter may be located either in the main menu or in the advanced settings. To enable the technology, open the option and set its value to Enabled. Then save the changes and reboot the system.
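After rebooting, one way to confirm that the setting took effect is to check whether the operating system sees more logical processors than physical cores. The sketch below is only an illustration under stated assumptions: it targets Linux on x86, where `/proc/cpuinfo` lists `processor`, `physical id` and `core id` fields, and the function names are hypothetical. On Windows, Task Manager shows the same two numbers as "Cores" and "Logical processors".

```python
import os
from typing import Tuple


def count_cpus(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (logical_processors, physical_cores) parsed from cpuinfo text."""
    logical = 0
    cores = set()
    physical_id = core_id = None
    # Entries are separated by blank lines; the extra "" closes the last entry.
    for line in cpuinfo_text.splitlines() + [""]:
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
        elif not line.strip():
            # End of one logical-processor entry: remember which core it sits on.
            if physical_id is not None and core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
    return logical, len(cores)


def hyper_threading_active(cpuinfo_text: str) -> bool:
    """True when the OS reports more logical processors than physical cores."""
    logical, physical = count_cpus(cpuinfo_text)
    return physical > 0 and logical > physical


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # assumption: Linux on x86
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
        print(count_cpus(text), hyper_threading_active(text))
```

If the two counts are equal, Hyper-Threading is either disabled in the BIOS or not supported by the processor.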

How is Hyper Threading technology useful?
In conclusion, a few words about the advantages that Hyper Threading provides. What is all this for? Why increase processing power at all? Users who work with resource-intensive applications need no explanation. Graphics, mathematical, and design software packages require a lot of system resources, and under such load the entire system begins to slow down badly. To mitigate this, it is recommended to enable Hyper Threading support.

15.03.2013

Hyper-Threading technology appeared in Intel processors, scary to say, more than 10 years ago. And at the moment it is an important element of Core processors. However, the question of the need for HT in games is still not completely clear. We decided to conduct a test to understand whether gamers need a Core i7, or if a Core i5 is better. And also find out how much better Core i3 is than Pentium.


Hyper-Threading Technology, developed by Intel and exclusively used in the company's processors, starting with the memorable Pentium 4, is something that is taken for granted at the moment. A significant number of processors of current and previous generations are equipped with it. It will be used in the near future.

And it must be admitted that Hyper-Threading technology is useful and has a positive effect on performance, otherwise Intel would not use it to position its processors within the line. And not as a secondary element, but one of the most important, if not the most important. To make it clear what we are talking about, we have prepared a table that makes it easy to evaluate the principle of segmentation of Intel processors.


As you can see, there are very few differences between the Pentium and Core i3, as well as between the Core i5 and Core i7. In fact, the i3 and i7 models differ from the Pentium and i5 only in the size of the third level cache per core (not counting the clock frequency, of course). The first pair has 1.5 megabytes, and the second pair has 2 megabytes. This difference cannot fundamentally affect the performance of processors, since the difference in cache size is very small. That is why Core i3 and Core i7 received support for Hyper-Threading technology, which is the main element that allows these processors to have a performance advantage over Pentium and Core i5, respectively.

As a result, a slightly larger cache and Hyper-Threading support allow Intel to charge significantly more for these processors. For example, Pentium processors (about 10 thousand tenge) are roughly half the price of Core i3 (about 20 thousand tenge), even though physically, at the hardware level, they are absolutely identical and therefore cost the same to produce. The price difference between Core i5 (about 30 thousand tenge) and Core i7 (about 50 thousand tenge) is also very large, although smaller than the twofold gap between the lower-end models.


How justified is this price increase? What real gain does Hyper-Threading provide? The answer has long been known: the increase varies, it all depends on the application and its optimization. We decided to check what HT can do in games, as one of the most demanding “household” applications. In addition, this test will be an excellent addition to our previous material on the effect of the number of cores in the processor on gaming performance.

Before moving on to the tests, let's remember (or find out) what Hyper-Threading Technology is. As Intel itself said when introducing this technology many years ago, there is nothing particularly complicated about it. In fact, all that is needed to introduce HT at the physical level is to add not one set of registers and an interrupt controller to one physical core, but two. In Pentium 4 processors, these additional elements increased the number of transistors by only five percent. In modern Ivy Bridge cores (as well as Sandy Bridge and future Haswell), the additional elements for even four cores do not increase the die by even 1 percent.


Additional registers and an interrupt controller, coupled with software support, allow the operating system to see not one physical core, but two logical ones. At the same time, the processing of data from two streams that are sent by the system still occurs on the same core, but with some features. One thread still has the entire processor at its disposal, but as soon as some CPU blocks are freed and idle, they are immediately given to the second thread. Thanks to this, it was possible to use all processor blocks simultaneously, and thereby increase its efficiency. As Intel itself stated, the performance increase under ideal conditions can reach up to 30 percent. True, these indicators are true only for the Pentium 4 with its very long pipeline; modern processors benefit from HT less.

But conditions are not always ideal for Hyper-Threading. And most importantly, the worst outcome of HT is not a lack of performance gain but a loss. That is, under certain conditions a processor with HT will perform worse than one without it, because the overhead of dividing threads and organizing a queue significantly exceeds the gain available from running threads in parallel in that particular case. And such cases occur much more often than Intel would like. Moreover, many years of using Hyper-Threading have not improved the situation. This is especially true for games, whose workloads are very complex and far from standard in how they process data.

In order to find out the impact of Hyper-Threading on gaming performance, we again used our long-suffering Core i7-2700K test processor, and simulated four processors at once by disabling cores and turning HT on/off. Conventionally, they can be called Pentium (2 cores, HT disabled), Core i3 (2 cores, HT enabled), Core i5 (4 cores, HT disabled), and Core i7 (4 cores, HT enabled). Why conditional? First of all, because according to some characteristics they do not correspond to real products. In particular, disabling cores does not lead to a corresponding reduction in the volume of the third level cache - its volume for all is 8 megabytes. And, in addition, all our “conditional” processors operate at the same frequency of 3.5 gigahertz, which has not yet been achieved by all processors in the Intel line.


However, this is even for the better: because all the important parameters are held constant, we can find out the real impact of Hyper-Threading on gaming performance without any reservations. And the percentage difference in performance between our “conditional” Pentium and Core i3 will be close to the difference between real processors, provided the frequencies are equal. It should also not be a concern that we are using a processor with Sandy Bridge architecture, since our efficiency tests, which you can read about in the article “Bare Performance - Examining the Efficiency of ALUs and FPUs,” showed that the influence of Hyper-Threading in the latest generations of Core processors remains unchanged. Most likely, this material will also be relevant for upcoming Haswell processors.

Well, it seems that all the questions regarding the testing methodology, as well as the operating features of Hyper-Threading Technology, have been discussed, and therefore it’s time to move on to the most interesting thing - the tests.

3DMark 11

Back in the test where we studied the impact of the number of processor cores on gaming performance, we found that 3DMark 11 is completely relaxed about CPU performance, working perfectly even on one core. Hyper-Threading had the same “powerful” influence. As you can see, the test does not notice any differences between Pentium and Core i7, not to mention intermediate models.

Metro 2033

But Metro 2033 clearly noticed the appearance of Hyper-Threading. And it reacted negatively! Yes, that's right: enabling HT in this game has a negative impact on performance. A small impact, of course: 0.5 frames per second with four physical cores, and 0.7 with two. But this gives every reason to say that in Metro 2033 the Pentium is faster than the Core i3, and the Core i5 is better than the Core i7. This confirms that Hyper-Threading is not effective always and everywhere.

Crysis 2

This game showed very interesting results. First of all, we note that the influence of Hyper-Threading is clearly visible in dual-core processors - the Core i3 is ahead of the Pentium by almost 9 percent, which is quite a lot for this game. Victory for HT and Intel? Not really, since the Core i7 did not show any gain relative to the noticeably cheaper Core i5. But there is a reasonable explanation for this - Crysis 2 cannot use more than four data streams. Because of this, we see a good increase in the dual-core with HT - still, four threads, albeit logical, are better than two. On the other hand, there was nowhere to put additional Core i7 threads; four physical cores were quite enough. So, based on the results of this test, we can note the positive impact of HT in the Core i3, which is noticeably better than the Pentium here. But among quad-core processors, the Core i5 again looks like a more reasonable solution.

Battlefield 3

The results here are very strange. While in the core-count test Battlefield 3 was an example of a microscopic but linear increase, enabling Hyper-Threading brought chaos into the results. In fact, we can state that the Core i3, with its two cores and HT, turned out to be the best of all, ahead of even the Core i5 and Core i7. It is strange, of course, but at the same time the Core i5 and Core i7 were again on the same level. What explains this is not clear. Most likely, the testing methodology for this game played a role here, since it gives larger errors than standard benchmarks.

F1 2011

In the last test, F1 2011 proved to be one of the games that is very sensitive to the number of cores, and in this test it again surprised us with the excellent impact of Hyper-Threading technology on performance. And again, as in Crysis 2, enabling HT worked very well on dual-core processors. Look at the difference between our conditional Core i3 and Pentium: it is more than twofold! It is clearly visible that two cores are far too few for the game, and at the same time its code is so well parallelized that the effect is amazing. On the other hand, you can't argue with four physical cores: the Core i5 is noticeably faster than the Core i3. But the Core i7, again, as in previous games, did not show anything outstanding compared to the Core i5. The reason is the same: the game cannot use more than 4 threads, and the overhead of running HT reduces the performance of the Core i7 below the level of the Core i5.

An old warrior needs Hyper-Threading no more than a hedgehog needs a T-shirt: its influence is by no means as clearly noticeable as in F1 2011 or Crysis 2. However, we still note that turning on HT on a dual-core processor brought 1 extra frame. This is certainly not enough to say that Core i3 is better than Pentium. At the very least, this improvement clearly does not correspond to the difference in price of these processors. And it's not even worth mentioning the price difference between Core i5 and Core i7, since the processor without HT support again turned out to be faster. And noticeably faster: by 7 percent. Whatever one may say, we again state the fact that four threads are the maximum for this game, and therefore Hyper-Threading in this case does not help the Core i7 but hinders it.

Hyper-Threading is a technology developed by Intel that allows a processor core to execute more than one (usually two) threads. Since it was found that a typical processor uses no more than 70% of its computing power in most tasks, it was decided to load idle execution units with work from another thread. This can increase core performance by 10 to 80%, depending on the task.

Understanding how Hyper-Threading works.

Let's say the processor is performing simple calculations while the instruction block and the SIMD extensions sit idle.

The addressing module detects this and sends data there for subsequent calculation. If the data is not what these blocks are designed for, they will process it more slowly, but the data will not sit waiting. Alternatively, they will pre-process it so that the appropriate block can handle it faster later. This gives an additional performance gain.

Naturally, a virtual thread does not match a full-fledged core, but it makes it possible to reach almost 100% utilization of the computing power, keeping nearly the entire processor busy. At the same time, implementing HT takes only about 5% of additional chip area, while performance can sometimes rise by up to 50%. This additional area includes extra register blocks and branch prediction logic, which work out on the fly where computing power is currently available and send data there from the additional addressing block.

The technology first appeared on Pentium 4 processors, but the performance gain was small, since the processor itself did not have much computing power. The gain was 15-20% at best, and in many tasks the processor ran much slower than without HT.

A slowdown caused by Hyper-Threading occurs if:

  • The cache is too small for both threads and is constantly reloaded, slowing down the processor.
  • The data cannot be handled correctly by the branch predictor. This happens mainly when the software is not optimized or the operating system lacks support.
  • There are data dependencies, for example when the first thread needs data from the second immediately, but it is not ready yet or is queued behind another thread. Or looped data needs certain blocks for fast processing while they are busy with other data. There can be many variations of data dependency.
  • The core is already heavily loaded, and the “insufficiently smart” branch prediction module still sends data to it, slowing the processor down (relevant for Pentium 4).

After the Pentium 4, Intel returned to the technology only with the first-generation Core i7, skipping the Core 2 series.

The computing power of processors had become sufficient to implement Hyper-Threading fully without much harm, even for unoptimized applications. Later, Hyper-Threading appeared on mid-range and even budget and mobile processors. It is used across the Core i series (i3, i5, i7) and on Atom mobile processors (not all of them). Interestingly, dual-core processors with HT gain more from Hyper-Threading than quad-core ones, reaching about 75% of a full-fledged quad-core.

Where is Hyper-Threading technology useful?

It is useful with professional, graphics, analytical, mathematical and scientific programs, video and audio editors, and archivers (Photoshop, Corel Draw, Maya, 3ds Max, WinRAR, Sony Vegas, etc.). HT will definitely be useful in any program that performs a large number of calculations. Fortunately, in 90% of cases such programs are well optimized for it.

Hyper-Threading is indispensable for server systems. In fact, it was partly developed for this niche. Thanks to HT, processor throughput can be increased significantly when there are many tasks. Each thread carries half the load, which benefits data addressing and branch prediction.

Many computer games react badly to Hyper-Threading, and the number of frames per second drops. This is due to the lack of optimization for Hyper-Threading on the game side. Optimization on the operating system side alone is not always enough, especially when working with unusual, diverse and complex data.

On motherboards that support HT, you can always disable hyperthreading technology.

January 20, 2015 at 07:43 pm

Once again about Hyper-Threading

There was a time when it was necessary to evaluate memory performance in the context of Hyper-threading technology. We have come to the conclusion that its influence is not always positive. When a quantum of free time appeared, there was a desire to continue research and consider the ongoing processes with an accuracy of machine clock cycles and bits, using software of our own design.

Platform under study

The object of the experiments is an ASUS N750JK laptop with an Intel Core i7-4700HQ processor. Clock frequency 2.4GHz, increased in Intel Turbo Boost mode up to 3.4GHz. Installed 16 gigabytes of DDR3-1600 RAM (PC3-12800), operating in dual-channel mode. Operating system – Microsoft Windows 8.1 64 bit.

Fig.1 Configuration of the platform under study.

The processor of the platform under study contains 4 cores, which, when Hyper-Threading technology is enabled, provides hardware support for 8 threads or logical processors. The platform firmware transmits this information to the operating system via the ACPI table MADT (Multiple APIC Description Table). Since the platform contains only one RAM controller, there is no SRAT (System Resource Affinity Table) table, which declares the proximity of processor cores to memory controllers. Obviously, the laptop under study is not a NUMA platform, but the operating system, for the purpose of unification, considers it as a NUMA system with one domain, as indicated by the line NUMA Nodes = 1. A fact that is fundamental for our experiments is that the first-level data cache has size 32 kilobytes for each of the four cores. Two logical processors sharing one core share the L1 and L2 caches.

Operation under study

We will study how the read speed of a data block depends on its size. To do this, we choose the fastest method, namely reading 256-bit operands with the AVX instruction VMOVAPD. In the graphs, the X axis shows the block size and the Y axis shows the read speed. Near the X value that corresponds to the L1 cache size we expect to see an inflection point, since performance should drop once the processed block no longer fits in the cache. In our test, in the multi-threaded case, each of the 16 launched threads works with a separate address range. To control Hyper-Threading from within the application, each thread calls the SetThreadAffinityMask API function, which sets a mask with one bit per logical processor. A bit set to one allows the given thread to use that processor; a zero forbids it. For the 8 logical processors of the platform under study, the mask 11111111b allows all processors to be used (Hyper-Threading enabled), while the mask 01010101b allows one logical processor per core (Hyper-Threading disabled).
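The two masks quoted above can be derived rather than hard-coded. The following Python sketch is an illustration only, not the authors' test code; the function name is hypothetical, and it assumes that the logical processors of one core are numbered consecutively (core 0 gets bits 0 and 1, core 1 gets bits 2 and 3, and so on), which matches the masks in the text but is not guaranteed on every system.

```python
def affinity_mask(cores: int, threads_per_core: int = 2, use_ht: bool = True) -> int:
    """Build a bit mask with one bit per logical processor.

    Assumption: logical processors of the same core occupy adjacent bits.
    """
    mask = 0
    # With HT "disabled", only the first logical processor of each core is allowed.
    allowed = threads_per_core if use_ht else 1
    for core in range(cores):
        first_bit = core * threads_per_core
        for offset in range(allowed):
            mask |= 1 << (first_bit + offset)
    return mask


if __name__ == "__main__":
    print(format(affinity_mask(4, use_ht=True), "08b"))   # all 8 logical processors
    print(format(affinity_mask(4, use_ht=False), "08b"))  # one logical processor per core
```

For the 4-core platform described here, the function yields 11111111b with Hyper-Threading and 01010101b without it, which are the values each thread passes to SetThreadAffinityMask.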

The following abbreviations are used in the graphs:

MBPS (Megabytes per Second): block reading speed in megabytes per second;

CPI (Clocks per Instruction): number of clock cycles per instruction;

TSC (Time Stamp Counter): CPU cycle counter.

Note: The TSC register clock speed may not match the processor clock speed when running in Turbo Boost mode. This must be taken into account when interpreting the results.
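To make the relationship between these quantities concrete, here is a rough Python sketch. It is not the measurement code from the article. It assumes that one megabyte is 10^6 bytes, that every read instruction moves one 256-bit (32-byte) operand, that loop overhead is ignored, and, for the example call, that the core held the 3.4 GHz Turbo Boost frequency for the whole run. The function names are hypothetical.

```python
BYTES_PER_READ = 32  # one 256-bit AVX operand, as read by VMOVAPD


def mbps(bytes_read: int, tsc_ticks: int, tsc_hz: float) -> float:
    """Read speed in megabytes per second (1 MB taken as 10**6 bytes)."""
    seconds = tsc_ticks / tsc_hz
    return bytes_read / seconds / 1e6


def cpi(mbps_value: float, core_hz: float) -> float:
    """Core clock cycles per 32-byte read instruction at the given speed."""
    instructions_per_second = mbps_value * 1e6 / BYTES_PER_READ
    return core_hz / instructions_per_second


if __name__ == "__main__":
    # Single-thread peak reported later in the article, assumed 3.4 GHz core clock.
    print(f"CPI at 213563 MBPS: {cpi(213563, 3.4e9):.3f}")
```

Under these assumptions the single-thread peak corresponds to about half a clock per instruction, that is, roughly two 256-bit reads per clock. If the real core clock differed from the TSC rate or from 3.4 GHz, the figure shifts accordingly, which is exactly why the note above matters.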

On the right side of the graphs, a hexadecimal dump of the instructions that make up the loop body of the target operation executed in each of the program threads, or the first 128 bytes of this code, is visualized.

Experiment No. 1. One thread



Fig.2 Single thread reading

The maximum speed is 213563 megabytes per second. The inflection point occurs at a block size of about 32 kilobytes.

Experiment No. 2. 16 threads on 4 processors, Hyper-Threading disabled



Fig.3 Reading with sixteen threads. The number of logical processors used is four

Hyper-Threading is disabled. Maximum speed 797598 megabytes per second. The inflection point occurs at a block size of about 32 kilobytes. As expected, compared to reading with one thread, the speed increased by approximately 4 times, based on the number of working cores.

Experiment No. 3. 16 threads on 8 processors, Hyper-Threading enabled



Fig.4 Reading with sixteen threads. The number of logical processors used is eight

Hyper-Threading is enabled. The maximum speed is 800722 megabytes per second, so enabling Hyper-Threading barely increased it. The big minus is that the inflection point now occurs at a block size of about 16 kilobytes, half the previous value, so the average speed has dropped significantly. This is not surprising: each core has its own L1 cache, while the logical processors of the same core share it.
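The peak figures from the three experiments can be compared with a few lines of arithmetic. The Python sketch below only restates the published numbers; the helper names are illustrative, and the even split of the L1 cache between two logical processors is a simplifying assumption used to explain the shifted inflection point.

```python
# Peak read speeds reported in the three experiments, in megabytes per second.
PEAK_MBPS = {
    "1 thread": 213563,
    "16 threads, HT off": 797598,
    "16 threads, HT on": 800722,
}
L1_DATA_CACHE_KB = 32  # per core, from the platform description


def ratio(new: float, old: float) -> float:
    """How many times larger `new` is than `old`."""
    return new / old


def l1_share_kb(cache_kb: float, threads_per_core: int) -> float:
    """L1 data cache available to each thread if a core splits it evenly."""
    return cache_kb / threads_per_core


if __name__ == "__main__":
    cores_gain = ratio(PEAK_MBPS["16 threads, HT off"], PEAK_MBPS["1 thread"])
    ht_gain = ratio(PEAK_MBPS["16 threads, HT on"], PEAK_MBPS["16 threads, HT off"])
    print(f"four cores vs one thread: {cores_gain:.2f}x")
    print(f"HT on vs HT off:          {(ht_gain - 1) * 100:.2f}% higher peak")
    print(f"L1 per thread with HT:    {l1_share_kb(L1_DATA_CACHE_KB, 2):.0f} KB")
```

Four cores give a little over 3.7 times the single-thread peak, while Hyper-Threading adds less than half a percent on top of that. At the same time, the L1 share per thread falls from 32 to 16 kilobytes, which lines up with the inflection point moving to about 16 kilobytes.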

Conclusions

The operation studied scales quite well on a multi-core processor. Reasons: Each core contains its own L1 and L2 cache, the target block size is comparable to the cache size, and each thread works with its own address range. For academic purposes, we created these conditions in a synthetic test, recognizing that real-world applications are usually far from ideal optimization. But enabling Hyper-Threading, even under these conditions, had a negative effect; with a slight increase in peak speed, there is a significant loss in the processing speed of blocks whose size ranges from 16 to 32 kilobytes.