all groups > dotnet clr > june 2005 >
Threading scenario - best approach?
Jon,

How would the timeout be implemented using a Monitor?
Andreas Håkansson <andreas@spamproof.selfinflicted.org> wrote:
> How would the timeout be implemented using a Monitor?

By using the overload of Monitor.Wait that takes a timeout.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
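The counter-plus-monitor idea discussed in this thread could be sketched in C# roughly as follows. This is only an illustration under stated assumptions: the `WorkCounter` class and its `Done` and `WaitAll` methods are hypothetical names, not part of the framework, and `Thread.Sleep` stands in for the real data collection. The waiting thread calls `Monitor.Wait` with the time remaining, so the whole wait is bounded by a single timeout.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: counts outstanding workers and lets one thread wait for zero.
sealed class WorkCounter
{
    private readonly object _lock = new object();
    private int _remaining;

    public WorkCounter(int workers) { _remaining = workers; }

    // Called by each worker when it finishes (or fails).
    public void Done()
    {
        lock (_lock)
        {
            if (--_remaining == 0)
                Monitor.PulseAll(_lock); // wake the thread blocked in WaitAll
        }
    }

    // Returns true if the counter reached zero within the timeout.
    public bool WaitAll(TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        lock (_lock)
        {
            while (_remaining > 0)
            {
                TimeSpan left = timeout - clock.Elapsed;
                if (left <= TimeSpan.Zero)
                    return false;          // overall timeout used up
                Monitor.Wait(_lock, left); // releases the lock while waiting
            }
            return true;
        }
    }
}

static class Program
{
    static void Main()
    {
        var counter = new WorkCounter(3);
        for (int i = 0; i < 3; i++)
        {
            int delayMs = 100 * (i + 1);
            var t = new Thread(() =>
            {
                try { Thread.Sleep(delayMs); } // stand-in for the real work
                finally { counter.Done(); }    // count down even on failure
            });
            t.IsBackground = true;
            t.Start();
        }

        bool finished = counter.WaitAll(TimeSpan.FromSeconds(5));
        Console.WriteLine(finished ? "all workers finished" : "timed out");
    }
}
```

The loop around `Monitor.Wait` re-checks the counter after every wake-up, so an early or spurious wake-up cannot make `WaitAll` return true while work is still outstanding.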
john conwell <johnconwell@discussions.microsoft.com> wrote:
> If you go with manually creating your own threads, I'm a bigger fan of using
> an AutoResetEvent with a WaitAll() call, rather than Join(). It just seems
> more elegant for managing a large collection of threads.

I suspect I'm biased because of my history here. Coming from a Java background, I'm very familiar and happy with Monitor.Wait/Pulse etc, but not so happy with *ResetEvents. I know of other developers who've come from a Win32 background and feel exactly the opposite.

Either will work perfectly well, of course :)

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Hi,

I would recommend your first scenario, as it is simple and straightforward.

Using the thread pool would introduce additional complications because of the limit on the number of thread pool threads, and it would not really improve performance, since we're talking about 5-15 second intervals.

Async delegates are essentially only a wrapper around the thread pool, so the same applies to them.

HTH,
Stefan
Stefan,

Thank you for your thoughts. I'm also leaning towards scenario #1 and have started writing a small prototype. How would you suggest I wait for the workers to finish before the Fetch method returns? I could call Join on all worker threads, as in my original post, or pass an Auto/ManualResetEvent to each worker, have them signal completion, and call WaitHandle.WaitAll in the Fetch method.
Hi,

I think you can try both, but I guess Join will be fine; there is no need to introduce another synchronization mechanism. Calling Join() on a thread that has already finished returns immediately, so the foreach ... Join loop does exactly what is expected: it finishes once all the threads are done.

But I'm not trying to push you into anything - use the approach you are most comfortable with.

Stefan
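For reference, scenario #1 with a plain Join loop might look like the C# sketch below. It is a minimal illustration, not the poster's actual code: the `Worker` and `Fetcher` classes, the two parameters (`source`, `count`) and the generated result strings are all made up for the example, and `Thread.Sleep` stands in for the 5-15 second collection. Each worker object carries its own parameters and its own result slot, so no callback is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical worker: holds its two input parameters and its own result slot.
sealed class Worker
{
    private readonly string _source;
    private readonly int _count;
    public object[] Result = new object[0];
    public Exception Error;

    public Worker(string source, int count) { _source = source; _count = count; }

    public void Run()
    {
        try
        {
            Thread.Sleep(50); // stand-in for the slow data collection
            var items = new object[_count];
            for (int i = 0; i < _count; i++) items[i] = _source + "#" + i;
            Result = items;
        }
        catch (Exception ex) { Error = ex; } // record the failure for the caller
    }
}

static class Fetcher
{
    public static object[] Fetch(string[] sources, int perSource)
    {
        var workers = new List<Worker>();
        var threads = new List<Thread>();
        foreach (string s in sources)
        {
            var w = new Worker(s, perSource);
            var t = new Thread(w.Run);
            workers.Add(w);
            threads.Add(t);
            t.Start();
        }

        // Join returns immediately for threads that have already finished.
        foreach (Thread t in threads) t.Join();

        // Safe to read the results now: every worker thread has ended.
        var combined = new List<object>();
        foreach (Worker w in workers)
            if (w.Error == null) combined.AddRange(w.Result);
        return combined.ToArray();
    }

    static void Main()
    {
        object[] all = Fetch(new[] { "a", "b", "c" }, 2);
        Console.WriteLine(all.Length); // prints 6
    }
}
```

Because the results are read only after every Join has returned, the combining step needs no extra locking.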
Hi,

No pressure felt :) I've gotten both to work equally well, and I would just like to understand the difference in approach. I guess each approach has its advantages and disadvantages. I don't really like to use code without understanding exactly what it is doing ;)
Hi,

In one of my classes I have a method, let's call it Fetch, which collects data from various sources and returns the combined result. Each of the sources can take between 5 and 15 seconds to collect, so I would like to increase performance by introducing multi-threading for the actual collecting of data. The Fetch method should spawn off the workers and block until all of them have finished (or failed).

I have done some light reading and would like some feedback on what the best approach would be to implement this scenario.

SCENARIO #1 - Using Thread

I thought about creating a worker thread which collects information from a source and returns the result. This worker thread would have to be able to take two parameters (used to determine what data to get) and return an array of objects.

The Fetch method could create a new worker object for each data source, pass it the correct (two) parameters, tell it which callback to use to signal its completion and pass back the return data, and then start it in a new thread.

Once all of the worker threads are running, the Fetch method would enter something like this (VB.NET as an example; it could just as well be C#, since I code in both):

    For Each WorkerThread In Workers
        WorkerThread.Join()
    Next

    Return CombinedResult

Thoughts and/or suggestions? Advantages/Disadvantages?

SCENARIO #2 - ThreadPool

Just like SCENARIO #1, but I would use the ThreadPool instead. How would I wait for all the threads to finish before returning, i.e. block the Fetch method until all workers have finished (or failed)?

Thoughts and/or suggestions? Advantages/Disadvantages?

SCENARIO #3 - Async Delegates

I create a delegate which takes my worker process as a parameter. The delegate is then called asynchronously with BeginInvoke, and an AsyncCallback is used to gather and combine the worker results. I would probably build this using the technique posted by Mike Woodring, of DevelopMentor, on the Advanced-Dotnet mailing list

(watch for line-wrapping)
http://discuss.develop.com/archives/wa.exe?A2=ind0302B&L=ADVANCED-DOTNET&D=0&I=-3&P=2534

to ensure EndInvoke is called, thus avoiding a possible memory leak. If I went down this road, how would I make the Fetch method block until all of the async operations had finished (or failed) without having to resort to a busy-wait?

All thoughts and suggestions on this subject will be appreciated.
Thanks!
First, are the Fetch methods getting data from a different server, or from the server the app is running on? Does the server that the threads run on have multiple processors or just one? If it's just one, you are more likely to slow your app down than speed it up: the same amount of processing has to get done, but now you are adding thread management and context switching to the mix. Only do this if your app is distributed or the server is multi-processor.

As for which way to go, I'm going to have to disagree. I'd go with solution 3, async delegates. First, async delegates have a simple way to wait for all threads to finish: collect all the returned IAsyncResult.AsyncWaitHandle values into an array and call WaitHandle.WaitAll, passing in the array. This pauses the main thread until all delegates have finished running. It doesn't get much easier.

Also, as far as performance goes, the cost of initializing 5-10 new manual threads is much higher than that of using the threads already initialized in the thread pool.

The thread pool's maximum count shouldn't be an issue either. If you call ThreadPool.GetMaxThreads you'll see how many threads can be created in the pool. On my system it's 100 (I'm not sure whether this differs per OS version). And if you plan on running more than 100 async tasks you should rethink this anyway, as it would probably bog down the CPU with all the processing and context switching. The thread pool can manage multiple threads quite well, and by the time you are ready to kick off your last thread, the first might be finished; in that case the thread pool will just reuse an existing thread instead of creating another. Remember that creating threads is a fairly significant performance hit.
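To check the pool limit on a given machine, a small C# sketch such as the one below can be used. The `PoolLimits` class and its `MaxWorkerThreads` method are hypothetical names introduced for the example; only `ThreadPool.GetMaxThreads` is a framework call. The numbers it reports depend on the runtime version and configuration, so no particular output should be assumed.

```csharp
using System;
using System.Threading;

static class PoolLimits
{
    // Returns the maximum number of worker threads the pool may create.
    public static int MaxWorkerThreads()
    {
        int workerThreads, completionPortThreads;
        ThreadPool.GetMaxThreads(out workerThreads, out completionPortThreads);
        return workerThreads;
    }

    static void Main()
    {
        Console.WriteLine("max worker threads: " + MaxWorkerThreads());
    }
}
```

Two caveats apply when following the async-delegate route today: delegate `BeginInvoke`/`EndInvoke` is only supported on .NET Framework (on .NET Core and later it throws `PlatformNotSupportedException`), and `WaitHandle.WaitAll` accepts at most 64 handles per call.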
If you go with manually creating your own threads, I'm a bigger fan of using an AutoResetEvent with a WaitAll() call, rather than Join(). It just seems more elegant for managing a large collection of threads.
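The event-based alternative might be sketched in C# as below, here combined with the thread pool from scenario #2. This is an illustration under assumptions: `WaitAllDemo` and `RunAll` are hypothetical names, the jobs are stand-ins, and a ManualResetEvent is used so that a signal stays set once a worker has finished. Each worker sets its own event in a `finally` block, and the caller waits on the whole array with one shared timeout.

```csharp
using System;
using System.Threading;

static class WaitAllDemo
{
    // Runs each job on the thread pool.
    // Returns true if every job signalled completion before the timeout.
    public static bool RunAll(Action[] jobs, TimeSpan timeout)
    {
        var done = new ManualResetEvent[jobs.Length];
        for (int i = 0; i < jobs.Length; i++)
        {
            ManualResetEvent signal = done[i] = new ManualResetEvent(false);
            Action job = jobs[i];
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { job(); }
                catch (Exception) { /* a real worker would record the failure */ }
                finally { signal.Set(); } // always signal, even on failure
            });
        }

        // One timeout covers the whole set of workers.
        // The events are deliberately not disposed here: after a timeout,
        // a late worker may still call Set on its event.
        return WaitHandle.WaitAll(done, timeout);
    }

    static void Main()
    {
        Action[] jobs =
        {
            () => Thread.Sleep(100), // stand-ins for the real data collection
            () => Thread.Sleep(200),
        };
        bool finished = RunAll(jobs, TimeSpan.FromSeconds(5));
        Console.WriteLine(finished ? "all workers finished" : "timed out");
    }
}
```

Note that `WaitHandle.WaitAll` is limited to 64 handles per call and needs at least one handle, so a larger or possibly empty set of workers would need a different mechanism, such as a shared counter.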
Netveloper <noone@nowhere.com> wrote:
> No pressure felt :) I've gotten both to work equally well, and I
> would just like to understand the difference in approach. I guess
> each approach has its advantages and disadvantages. I don't really
> like to use code without understanding exactly what it is doing ;)

Personally I'd use Join - no need to create any events you don't need, and it does exactly what it says on the tin.

If you want to use a custom threadpool for this, by the way, you could use the one I've written:
http://www.pobox.com/~skeet/csharp/miscutil

You could subscribe to the event which is fired after a thread job has finished to synchronize the main thread. (Of course, you wouldn't be able to use Thread.Join in that scenario.)

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
Andreas Håkansson <andreas@spamproof.selfinflicted.org> wrote:
> Thanks for your feedback. I've been thinking about leveraging a timeout so that the collecting of data won't block indefinitely. I saw that both the Join and WaitAll methods accept an optional timeout parameter.
>
> However, the functionality they provide isn't interchangeable: with a timeout, the Join method will let the first thread run for a maximum time of x (the timeout), the next thread 2*x, the next 3*x and so on. With WaitAll, all threads get the same chance to execute before the method stops blocking the execution of the main thread.

True. Note, however, that there is an alternative to using Auto/ManualResetEvents - you can use Monitor.Wait and Monitor.Pulse. Personally, I prefer these - they feel more idiomatic .NET somehow, rather than being Win32 shims. (They also perform very slightly better if I remember rightly, but the difference isn't significant.)

You could make each worker thread decrement a counter (which is set by the main thread) and when the last worker thread decrements it to 0, it could pulse the monitor.

> The timeout, however, makes me wonder about the leftover worker threads. They will continue executing in the background until they are finished. Do I have to clean them up myself, and if so, how? What about, for example, if one of the worker threads calls a web service and for some reason is unable to establish a connection, leaving it waiting for its own timeout, which could have been increased beyond the default time? This would leave the worker threads hanging around for a long time even though the main thread timed out and continued executing.. =/

See http://www.pobox.com/~skeet/csharp/threads/shutdown.shtml for general guidance about stopping tasks in a controlled way.

--
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet
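The counter-plus-monitor idea from the post above could look roughly like the sketch below. It is only an illustration, not code from the thread: the `Countdown` class and its member names are made up for this example. In .NET the signalling call on `Monitor` is `Pulse` or `PulseAll`, and `Monitor.Wait` takes an optional timeout, which is what bounds the main thread's wait.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not from the thread): counts worker completions
// and lets one thread wait for all of them with a single overall timeout.
public sealed class Countdown
{
    private readonly object _lock = new object();
    private int _remaining;

    public Countdown(int count) { _remaining = count; }

    // Called by each worker when it finishes (or fails).
    public void Signal()
    {
        lock (_lock)
        {
            if (--_remaining <= 0)
                Monitor.PulseAll(_lock);   // wake the waiting thread
        }
    }

    // Returns true if every worker signalled before the timeout expired.
    public bool Wait(TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        lock (_lock)
        {
            while (_remaining > 0)
            {
                TimeSpan left = deadline - DateTime.UtcNow;
                // Monitor.Wait releases the lock while blocked and returns
                // false if the timeout elapsed before a pulse arrived.
                if (left <= TimeSpan.Zero || !Monitor.Wait(_lock, left))
                    return _remaining <= 0;
            }
            return true;
        }
    }
}

public static class CountdownDemo
{
    public static void Main()
    {
        Countdown countdown = new Countdown(3);
        for (int i = 0; i < 3; i++)
        {
            Thread worker = new Thread(() =>
            {
                Thread.Sleep(100);         // stand-in for collecting data
                countdown.Signal();
            });
            worker.IsBackground = true;    // do not keep the process alive
            worker.Start();
        }

        bool finished = countdown.Wait(TimeSpan.FromSeconds(5));
        Console.WriteLine(finished ? "all workers finished" : "timed out");
    }
}
```

The wait loop re-checks the counter after every wake-up, so a pulse that arrives before the main thread starts waiting is not lost. Workers that are still running after a timeout are not stopped by this; that is the separate shutdown problem the linked page covers.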
Jon,

Thanks for your feedback. I've been thinking about leveraging a timeout so that the collecting of data won't block indefinitely. I saw that both the Join and WaitAll methods accept an optional timeout parameter.

However, the functionality they provide isn't interchangeable: with a timeout, the Join method will let the first thread run for a maximum time of x (the timeout), the next thread 2*x, the next 3*x and so on. With WaitAll, all threads get the same chance to execute before the method stops blocking the execution of the main thread.

The timeout, however, makes me wonder about the leftover worker threads. They will continue executing in the background until they are finished. Do I have to clean them up myself, and if so, how? What about, for example, if one of the worker threads calls a web service and for some reason is unable to establish a connection, leaving it waiting for its own timeout, which could have been increased beyond the default time? This would leave the worker threads hanging around for a long time even though the main thread timed out and continued executing.. =/

"Jon Skeet [C# MVP]" <skeet@pobox.com> wrote in message news:MPG.1d1bbe66629cdffa98c31c@msnews.microsoft.com...
> Netveloper <noone@nowhere.com> wrote:
> > No pressure felt :) I've gotten both to work, equally well, and I would just like to understand the difference in approach. I guess there are advantages/disadvantages with using either of the approaches. Don't really like to use code without understanding exactly what it is doing ;)
>
> Personally I'd use Join - no need to create any events you don't need, and it does exactly what it says on the tin.
>
> If you want to use a custom threadpool for this, by the way, you could use the one I've written:
> http://www.pobox.com/~skeet/csharp/miscutil
>
> You could subscribe to the event which is fired after a thread job has finished to synchronize the main thread. (Of course, you wouldn't be able to use Thread.Join in that scenario.)
>
> --
> Jon Skeet - <skeet@pobox.com>
> http://www.pobox.com/~skeet
> If replying to the group, please do not mail me too
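The x, 2*x, 3*x effect described above comes from passing the full timeout to every Join call. One possible way around it, sketched here under the assumption that a single overall deadline is wanted, is to give each Join only the time that is still left. The `JoinAll` helper and the sleeping workers are hypothetical, purely for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class JoinWithDeadline
{
    // Hypothetical helper: waits for all threads, but never longer than
    // 'timeout' in total. Returns true if every thread finished in time.
    public static bool JoinAll(Thread[] workers, TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        foreach (Thread worker in workers)
        {
            TimeSpan left = timeout - clock.Elapsed;
            if (left < TimeSpan.Zero)
                left = TimeSpan.Zero;      // Join with zero just checks the state
            if (!worker.Join(left))
                return false;              // this worker is still running
        }
        return true;
    }

    public static void Main()
    {
        Thread[] workers = new Thread[3];
        for (int i = 0; i < workers.Length; i++)
        {
            int delayMs = 100 * (i + 1);   // stand-in for a slow data source
            workers[i] = new Thread(() => Thread.Sleep(delayMs));
            workers[i].IsBackground = true; // do not keep the process alive
            workers[i].Start();
        }

        bool finished = JoinAll(workers, TimeSpan.FromSeconds(2));
        Console.WriteLine(finished ? "all workers finished" : "timed out");
    }
}
```

With this shape, Join gives every worker the same overall window, much like WaitAll does. As with any timeout, workers that miss the deadline keep running in the background unless they are told to stop.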
John,

Thanks for your feedback. Well, let's see. The system is a multi-CPU setup with an ample amount of memory and a disk system with good throughput. The data sources are not located on the same machine; all are on remote web services and RDBMSs.

When it comes to using async delegates, I really wouldn't base my decision on your arguments (this is not to say async delegates wouldn't be a good solution). The reason is that collecting the wait handles and doing a WaitAll on them is no different from doing the same when manually spawning your own threads (with the help of Auto/ManualResetEvent objects), calling Join on each thread, or, like Jon suggested, using a Monitor.

Also, if you consider my brief description of the data collection, it will take between 5-15 seconds (could take longer), averaging around 10 seconds. With this time frame in mind, the cost of spawning a new thread and any context switching that might take place every now and then is fairly cheap. If you don't consider the context, then sure, thread creation and context switching are expensive operations.

The default size of the thread pool is 25, and it's defined in the processModel node of machine.config. The pool itself is the most interesting point for using either scenario 2 or 3. There is no denying that using the pool to recycle threads will boost performance; how much is hard to tell, since we're speaking in terms relative to the actual collecting of data. If I need to create x threads for each call to Fetch and there are y calls to Fetch each second/minute, then I might as well funnel them through the pool.

But.. the thread pool wouldn't be exclusive to my Fetch method; it would be shared across my application (which, btw, is a web application), and if there are any async operations etc. elsewhere, they will eat away at the pool, possibly leaving the worker threads of the Fetch method to queue up and wait, resulting in a decrease in performance. Increasing the size of the thread pool could solve this.

Sorry if I'm not very coherent here, but I only got a couple of hours of sleep last night and I admit that I'm just ranting whatever thoughts spring into my head while replying to your post :-)

"john conwell" <johnconwell@discussions.microsoft.com> wrote in message news:D0EC33A8-5332-4705-9D99-6BDD1DC95D8D@microsoft.com...
> First, are the Fetch methods getting data on a different server, or on the server the app is running on? Does the server that the threads are running on have multiple procs or just one? If it's just one, then you are more likely to slow your app down than speed it up. The same amount of processing has to get done, but now you are tossing thread management and context switching into the mix. Only do this if your app is distributed or the server is multi-proc.
>
> As far as which way to go, I'm going to have to disagree. I'd go with solution 3, async delegates. First, async delegates have a simple way to wait for all threads to finish. Just collect all the returned IAsyncResult.AsyncWaitHandles into an array and call WaitHandle.WaitAll, passing in the array. This will pause the main thread until all delegates are finished running. Doesn't get much easier.
>
> Also, as far as performance goes, the perf cost of initializing 5 - 10 new manual threads is much more than utilizing the pre-existing threads already initialized in the thread pool. As far as a thread pool max count is concerned, this shouldn't be an issue either.
> If you call ThreadPool.GetMaxThreads you'll see how many threads can be created in the pool. On my system it's 100 (not sure if this differs per OS version or not). And if you plan on running more than 100 async tasks you should rethink this too, as it would probably bog down the CPU with all the processing and context switching. The thread pool can manage multiple threads quite well, and by the time you are ready to kick off your last thread, the first thread might be finished. In that case the thread pool will just reuse an existing thread instead of creating another.
>
> Remember, creating threads is a fairly significant performance hit.
>
> "Netveloper" wrote:
>
>> Hi,
>>
>> In one of my classes I have a method, let's call it Fetch, which will collect data from various sources and return the combined result. Each of the sources can take between 5-15 seconds to collect, so I would like to increase the performance by introducing multi-threading support for the actual collecting of data. So the Fetch method should spawn off the workers and block until all of the workers have finished (or failed).
>>
>> I have done some light reading and would like some feedback on what the best approach would be to implement this scenario.
>>
>> SCENARIO #1 - Using Thread
>>
>> I thought about creating a worker thread which will collect information from a source and return the result. This worker thread would have to be able to take two parameters (used to determine what data to get) and return an array of objects.
>>
>> The Fetch method could create a new worker object for each data source, pass the correct (two) parameters to it, inform it about which callback to use to signal its completion and pass back the return data, and then start it in a new thread.
>>
>> Once all of the worker threads are running, the Fetch method would enter something like this (note: VB.NET as an example, could just as well be C# since I code in both)
>>
>> For Each WorkerThread In Workers
>>     WorkerThread.Join()
>> Next
>>
>> Return CombinedResult
>>
>> Thoughts and/or suggestions? Advantages/Disadvantages?
>>
>>
>> SCENARIO #2 - ThreadPool
>>
>> Just like SCENARIO #1 but I would use the ThreadPool instead. How would I wait for all the threads to finish before returning, i.e. blocking the Fetch method until all workers have finished (or failed)?
>>
>> Thoughts and/or suggestions? Advantages/Disadvantages?
>>
>>
>> SCENARIO #3 - Async Delegates
>>
>> I create a delegate which takes my worker process as a parameter. The delegate is then called using the async method, BeginInvoke, and uses an AsyncCallback to gather and combine the worker results. I would probably build this using the technique posted by Mike Woodring, of DevelopMentor, on the Advanced-Dotnet mailing list
>>
>> (watch for line-wrapping)
>> http://discuss.develop.com/archives/wa.exe?A2=ind0302B&L=ADVANCED-DOTNET&D=0&I=-3&P=2534
>>
>> to ensure EndInvoke was called, thus avoiding a possible memory leak. If I went down this
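For the ThreadPool question in scenario #2 (how to block until all pooled workers are done), one common pattern is the WaitHandle.WaitAll approach mentioned in this message: give each work item its own ManualResetEvent and wait on the whole array. The sketch below is only an illustration; `FetchAll`, the fake "data from source" strings and the sleep are hypothetical stand-ins, and it uses plain work items rather than async delegates.

```csharp
using System;
using System.Threading;

public static class WaitAllDemo
{
    // Hypothetical helper: runs one pooled work item per data source and
    // returns the results, or null if the overall timeout expired.
    public static string[] FetchAll(int sourceCount, TimeSpan timeout)
    {
        string[] results = new string[sourceCount];
        WaitHandle[] done = new WaitHandle[sourceCount];

        for (int i = 0; i < sourceCount; i++)
        {
            int index = i;                 // capture a per-iteration copy
            ManualResetEvent finished = new ManualResetEvent(false);
            done[index] = finished;

            ThreadPool.QueueUserWorkItem(delegate
            {
                try
                {
                    Thread.Sleep(100);     // stand-in for the real fetch
                    results[index] = "data from source " + index;
                }
                finally
                {
                    finished.Set();        // signal even if the fetch throws
                }
            });
        }

        // One timeout for the whole batch. WaitAll accepts at most 64 handles.
        bool allDone = WaitHandle.WaitAll(done, timeout);
        return allDone ? results : null;
    }

    public static void Main()
    {
        string[] results = FetchAll(3, TimeSpan.FromSeconds(5));
        Console.WriteLine(results != null ? string.Join(", ", results) : "timed out");
    }
}
```

Because every work item borrows a pool thread, this shares the pool with everything else in the process, which is exactly the concern raised above for a web application.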
Oh, it's a web app... That really makes a difference. From my experience you definitely don't want to use the thread pool then, because you would be stealing threads from your site's request handler, since it also uses the thread pool to service new HTTP requests. I've played around with this a lot, and it's hard to find a good mix when each request could kick off multiple threads. Definitely use a web site load test tool (such as ACP) to prove whether you actually sped things up or slowed them down.

I had a site that, for a specific request, needed to get 7 result sets of data (from a web service). I tried many combinations of threading: one thread per result set, 2 result sets per thread, thread pool, manual threads. In the end, with the site under moderate load, the fastest method was to do it synchronously. Those web service calls were pretty short, so in your situation you would get better results, since each call takes 10 - 15 seconds.

Another thing to consider is to create a custom IHttpHandler to intercept all calls to this page and kick off the threads in the ProcessRequest() method, then forward the request on to the desired page to be processed. In that page, sync back up with the threads using Join(). This way the threads get some extra processing time in before they have to sync back up.

"Netveloper" wrote:
> John,
>
> Thanks for your feedback. Well, let's see. The system is a multi-CPU setup with an ample amount of memory and a disk system with good throughput. The data sources are not located on the same machine; all are on remote web services and RDBMSs. When it comes to using async delegates, I really wouldn't base my decision on your arguments (this is not to say async delegates wouldn't be a good solution).