
visual c : sse


bill
7/29/2003 5:30:15 AM
I am working on a project with a lot of array manipulations (sin, cos, mult).
Does anyone know of a package that uses SIMD (SSE or MMX) to speed up the
processing, particularly for the sin/cos?

Gotta go fast!
Thanks,
Bill

George M. Garner Jr.
7/29/2003 10:29:53 AM
Bill,

Take a look at http://www.codeproject.com/cpp/mmxintro.asp.

Regards,

George.


bill
7/29/2003 3:18:38 PM
thanks - now I just need a few weeks to figure it out!


Carl Daniel [VC++ MVP]
7/29/2003 3:22:14 PM

IIRC, MMX/SSE/SSE2 won't help with the trig functions, but can definitely be
used to optimize matrix multiplication.
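
As a rough illustration of the matrix case, here is a minimal sketch of a 4x4 matrix times 4-vector product using SSE intrinsics. The function name mat4_mul_vec4 and the column-major layout are assumptions made for this example, not something from the thread.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_mul_ps, _mm_add_ps, ...

// Hypothetical helper: out = M * v, where m holds a 4x4 matrix in
// column-major order (m[0..3] is column 0, m[4..7] is column 1, ...).
void mat4_mul_vec4(const float* m, const float* v, float* out)
{
    // Each column fills one 128-bit register (four floats).
    __m128 c0 = _mm_loadu_ps(m + 0);
    __m128 c1 = _mm_loadu_ps(m + 4);
    __m128 c2 = _mm_loadu_ps(m + 8);
    __m128 c3 = _mm_loadu_ps(m + 12);

    // out = c0*v[0] + c1*v[1] + c2*v[2] + c3*v[3];
    // _mm_set1_ps broadcasts one scalar into all four lanes.
    __m128 r = _mm_mul_ps(c0, _mm_set1_ps(v[0]));
    r = _mm_add_ps(r, _mm_mul_ps(c1, _mm_set1_ps(v[1])));
    r = _mm_add_ps(r, _mm_mul_ps(c2, _mm_set1_ps(v[2])));
    r = _mm_add_ps(r, _mm_mul_ps(c3, _mm_set1_ps(v[3])));

    _mm_storeu_ps(out, r);
}
```

Four multiplies and three adds on whole registers replace the sixteen multiplies and twelve adds of the scalar version. Unaligned loads are used so the caller does not have to align the data to 16 bytes.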

-cd

Brandon Bray [MSFT]
7/29/2003 5:51:38 PM

Of course, inline assembly isn't always necessary. Often it is better to use
intrinsic functions (which are documented alongside the SSE/SSE2 support in
Visual C++). The compiler deals with intrinsic functions better because it
can optimize around them in ways it cannot with inline assembly, and
intrinsics are more portable between different architectures.
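
To give an idea of what intrinsics look like for the array multiplication in the original question, here is a minimal sketch. The function name multiply_arrays is made up for this example; the intrinsics themselves come from the standard xmmintrin.h header.

```cpp
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics: _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps

// Hypothetical helper (not from the thread): out[i] = a[i] * b[i].
void multiply_arrays(const float* a, const float* b, float* out, std::size_t n)
{
    std::size_t i = 0;

    // Main loop: four floats per iteration. Unaligned loads and stores
    // are used so the arrays need not be aligned to 16 bytes.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_mul_ps(va, vb));
    }

    // Scalar tail for the remaining 0 to 3 elements.
    for (; i < n; ++i)
        out[i] = a[i] * b[i];
}
```

If the arrays are known to be 16-byte aligned, the aligned variants _mm_load_ps and _mm_store_ps could be used instead.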

Just my two cents. Cheerio!

--
Brandon Bray Visual C++ Compiler
This posting is provided AS IS with no warranties, and confers no rights.

Brandon Bray [MSFT]
7/29/2003 5:56:01 PM

You're right. Still, there are some benefits to using the SSE/SSE2
registers for the trig functions rather than the x87 FP stack. Over
time, processors will optimize for register architectures. Already,
compilers handle register architectures better than stack architectures (as
evidenced by much better optimization for integer code). Anyway, some
trig routines supplied by Visual C++ will use SSE/SSE2 instructions after
first checking the CPU ID.
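
For anyone wanting to do a similar check in their own code, here is a minimal sketch of querying CPUID for SSE/SSE2 support. It is not how the Visual C++ runtime does it; the names SimdSupport and detect_simd are made up, and it assumes a compiler that provides either the __cpuid intrinsic in intrin.h (later Visual C++ versions) or cpuid.h (GCC/Clang).

```cpp
#if defined(_MSC_VER)
#include <intrin.h>   // __cpuid (assumes a Visual C++ version that ships it)
#else
#include <cpuid.h>    // __get_cpuid (GCC/Clang)
#endif

// Hypothetical result type for this sketch.
struct SimdSupport {
    bool sse;
    bool sse2;
};

SimdSupport detect_simd()
{
    unsigned int edx = 0;
#if defined(_MSC_VER)
    int regs[4] = {0, 0, 0, 0};          // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);                    // leaf 1: feature flags
    edx = static_cast<unsigned int>(regs[3]);
#else
    unsigned int eax = 0, ebx = 0, ecx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        edx = 0;                         // leaf 1 unavailable: report nothing
#endif
    SimdSupport s;
    s.sse  = (edx & (1u << 25)) != 0;    // CPUID.1:EDX bit 25 = SSE
    s.sse2 = (edx & (1u << 26)) != 0;    // CPUID.1:EDX bit 26 = SSE2
    return s;
}
```

A program would typically call detect_simd() once at startup and pick the SSE or plain C++ code path based on the result.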

Don't forget that the compiler can also generate SSE/SSE2 instructions when
given either the /arch:SSE or /arch:SSE2 switches.

Hope that helps. Cheerio!

--
Brandon Bray Visual C++ Compiler
This posting is provided AS IS with no warranties, and confers no rights.

Andre
7/29/2003 11:34:36 PM
Just wanted to know.. what really is sse and how can it improve code
performance? Thanks

-Andre

Teis Draiby
7/30/2003 2:04:12 AM
I've used SSE2 (Streaming SIMD Extensions 2) instructions for
performance-critical loops in inline assembly blocks directly in my VC++ code.
I'm not an experienced SSE2 programmer, but I'll share my knowledge anyway.
There are other ways to make use of SSE2; see below.

If you want to use SSE2 instructions directly, Intel's 'IA-32 Architecture
Software Developer's Manual' tells you about everything you need to know
when it comes to MMX/SSE/SSE2. You can get it at
ftp://download.intel.com/design/Pentium4/manuals/24547012.pdf . It is very
low-level, though, with no code examples either. It takes a lot of coffee.


SSE2 is an extension to SSE and MMX. They all use the SIMD ('single
instruction, multiple data') model. That means that with a single instruction
you can perform an operation on up to four 32-bit numbers simultaneously.
Using SIMD can therefore increase the speed significantly. You can use SIMD
instructions in cases where you need to perform the same operation on a
large amount of similar data, as in video or 3D applications.

MMX: Instruction set that operates on 64 bit registers, each containing for
example two 32 bit integers that are processed simultaneously. Unfortunately
MMX only includes instructions that operate on integers.
The eight MMX registers are called MM0 - MM7 (an easy way to identify the
use of MMX in assembly code).

SSE: extends MMX so that you can also do floating point operations, on new
128 bit registers. With SSE you can operate on four 32 bit single precision
floating point numbers simultaneously. Came with the Intel P3 processors.
The eight SSE registers are called XMM0 - XMM7.

SSE2: Yet another extension. Includes all SSE operations but adds double
precision floating point (two 64 bit numbers per register) and integer
operations on the 128 bit registers. Now you can work with four 32 bit
integers. Introduced with the P4 processors.
SSE2 uses the same registers as SSE.
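
As a small illustration of the SSE2 integer operations, here is a sketch that adds four 32 bit integers with one instruction, written with intrinsics rather than inline assembly. The function name add4_int32 is made up for this example; the intrinsics come from the standard emmintrin.h header.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics: __m128i, _mm_add_epi32, ...

// Hypothetical helper: out[i] = a[i] + b[i] for exactly four 32 bit ints.
void add4_int32(const int* a, const int* b, int* out)
{
    // Load four ints into each XMM register (unaligned loads, so the
    // arrays do not need 16-byte alignment).
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));

    // One PADDD instruction adds all four lanes at once.
    __m128i sum = _mm_add_epi32(va, vb);

    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), sum);
}
```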

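SSE2/SSE are processor specific. SSE instructions work only on Intel P3
processors and later, and SSE2 instructions only on Intel P4 processors and
later. AMD processors also implement them: the Athlon XP supports SSE, and
the AMD64 processors (Opteron, Athlon 64) support SSE2. AMD's own, separate
SIMD extension is called '3DNow!'.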


You can use SSE2 instructions directly in Visual Studio in an inline
assembly '__asm' block, as I have done in performance-critical loops.

As said, there are also other ways to utilize the SSE2 registers, which I
have no experience with:
I think you can allow the compiler to use SIMD instructions when
compiling your C++ code. I don't know how.
You can also use SIMD 'intrinsics', which are C++ functions that
map directly to SIMD instructions.
+... ?

regards, Teis




Teis Draiby
7/30/2003 10:31:17 PM

I got a "Command line warning D4002 : ignoring unknown option
'/arch:SSE2'".
Is Visual Studio C++ .NET 2000 able to recognize this switch?


Thanks Teis

