Xiuming wrote:
> BTW, I missed some information in my post.
> When I tested with a hashtable object, I called the Clear() method of
> the hashtable instance at the end of Test1() method.
> After the Clear() call, references to those string object should be
> counted down to 0 I think,
The CLR GC currently uses no reference counting for local in-process
objects, AFAIK. There may be special functionality for COM or remoting,
but I don't know the details for those.
> and all those string objects should stay in
> gen0 and should be collected for a while.
Take a reference to one of the first strings you add to the
dictionary and check its generation after the end of the loop, using
GC.GetGeneration, to confirm your theory that it is in gen0. I think you
will find differently, if you do add 10 million unique strings to the
hash table. That would have caused at least one GC, I expect.
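A rough sketch of that check, in case it helps. The class and method names are made up for illustration, and I use 1,000,000 strings rather than 10 million to keep the run short. The generation you see depends on the runtime and GC settings, so treat the printed number as an observation, not a guarantee.

```csharp
using System;
using System.Collections;

static class GenerationCheck
{
    // Hypothetical helper: fills a Hashtable with `count` unique strings and
    // reports which generation the first string added ends up in.
    public static int GenerationOfFirstString(int count)
    {
        Hashtable table = new Hashtable();
        string first = null;
        for (int i = 0; i < count; i++)
        {
            string s = "item" + i;
            if (i == 0) first = s;   // keep a reference to the first string
            table[s] = s;
        }
        int generation = GC.GetGeneration(first);
        GC.KeepAlive(table);         // keep the table rooted until after the check
        return generation;
    }

    static void Main()
    {
        Console.WriteLine("first string is in gen" + GenerationOfFirstString(1000000));
    }
}
```

If the loop triggered any collections, I would expect the first string to report gen1 or gen2 rather than gen0.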
> But when I check the memory
> used by the process after running for 1 hour, it still remains the
> same.
GC only occurs when memory is being allocated. The GC does nothing if
you are not allocating memory.
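One way to watch this, as a sketch only: GC.CollectionCount reports how many times a generation has been collected so far, so you can compare it before and after a burst of allocations. The class, method and field names are mine, and the count you get depends on the gen0 budget the runtime picks on your machine.

```csharp
using System;

static class AllocationDrivesGc
{
    // Storing into a static field keeps the allocation from being optimized away.
    static byte[] sink;

    // Hypothetical helper: allocates `count` short-lived 1 KB arrays and returns
    // how many gen0 collections happened while doing so.
    public static int Gen0CollectionsDuring(int count)
    {
        int before = GC.CollectionCount(0);
        for (int i = 0; i < count; i++)
        {
            sink = new byte[1024];   // the previous array becomes garbage
        }
        return GC.CollectionCount(0) - before;
    }

    static void Main()
    {
        Console.WriteLine("gen0 collections with no allocation: " + Gen0CollectionsDuring(0));
        Console.WriteLine("gen0 collections while allocating:   " + Gen0CollectionsDuring(1000000));
    }
}
```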
First, the GC tries to allocate from gen0. If gen0 does not have enough
free memory, the GC collects gen0. This causes memory for live (== rooted)
objects to be copied to gen1, and gen0 becomes clean again; later
allocations will come from gen0.
However, if collecting gen0 didn't free much memory, then the GC will
(probably) collect gen1 (excluding recently copied objects from gen0).
Any memory for live objects in gen1 will be copied to gen2, and gen1
becomes clean again; later collections of gen0 will copy to gen1.
If collecting gen1 didn't free much memory, then the GC will (probably)
collect gen2 (excluding recently copied objects from gen1). Any free
space created in gen2 will be filled up by compacting the heap, that is,
moving the memory for live objects so that they are all contiguous in
memory.
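To see the promotion described above, here is a small sketch that forces the collections by hand instead of waiting for allocation pressure. The names are hypothetical. On the runtimes I have tried, a rooted object typically moves from gen0 to gen1 to gen2, but the GC is free to do otherwise, so take the output as an observation.

```csharp
using System;

static class PromotionDemo
{
    // Hypothetical helper: records the generation of one live object when new,
    // after a forced gen0 collection, and after a forced gen1 collection.
    public static int[] GenerationsAcrossCollections()
    {
        object survivor = new object();
        int[] seen = new int[3];
        seen[0] = GC.GetGeneration(survivor);   // freshly allocated
        GC.Collect(0);                          // collect gen0 only
        seen[1] = GC.GetGeneration(survivor);
        GC.Collect(1);                          // collect gen0 and gen1
        seen[2] = GC.GetGeneration(survivor);
        GC.KeepAlive(survivor);                 // stays rooted throughout
        return seen;
    }

    static void Main()
    {
        int[] seen = GenerationsAcrossCollections();
        Console.WriteLine("new: gen{0}, after gen0 GC: gen{1}, after gen1 GC: gen{2}",
                          seen[0], seen[1], seen[2]);
    }
}
```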
So, if you have an object which has survived many GCs (such as a hash
table that was added to in a big loop, like 10,000,000 items), then all
the memory for it is probably in gen2. In order for it to be freed up by
the GC, a gen2 collection will need to occur. That will need a gen1
collection to have occurred (and failed to free much memory), which in
turn will need a gen0 to have occurred (and failed to free much memory).
So, in order for the hash table data to be freed, you will need to do
more allocations, maybe a lot more. This is good for real applications
because it means the GC doesn't bother collecting until you ask it for
more memory. It is bad for benchmarks, because it seems like the GC
doesn't do much work. You can force a GC, by calling
GC.Collect(generation-number). You can find out which generation an
object is in by calling GC.GetGeneration().
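For your benchmark, something along these lines might show the memory actually being released. It is a sketch with made-up names: it clears the table, forces a collection of every generation, and compares GC.GetTotalMemory before and after. Note that GC.GetTotalMemory reports the managed heap, which is not the same figure as the process memory shown by the OS.

```csharp
using System;
using System.Collections;

static class ForcedCollection
{
    // Hypothetical helper: fills a Hashtable, clears it, forces a full
    // collection, and returns how many managed bytes were released.
    public static long BytesFreedByClearAndCollect(int count)
    {
        Hashtable table = new Hashtable();
        for (int i = 0; i < count; i++)
        {
            table["item" + i] = i;
        }

        long before = GC.GetTotalMemory(false);   // no collection forced here
        table.Clear();                            // drop references to keys and values
        GC.Collect(GC.MaxGeneration);             // collect all generations
        long after = GC.GetTotalMemory(false);

        GC.KeepAlive(table);
        return before - after;
    }

    static void Main()
    {
        Console.WriteLine("managed bytes freed: " + BytesFreedByClearAndCollect(1000000));
    }
}
```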
The "probably" in the description above is because the GC may choose to
collect or not to collect based on the historical running of the
program. The idea is that it can tune itself to the running
characteristics of the program.
For example, if it's a batch-mode application, then the most efficient
way is to let loads of memory be used up, then freed all in one go,
ideally by application exit (when GC is free); if it's a long-running
iterative batch-mode application, then peak performance will
theoretically occur when gen0 is allowed to get very large.
Alternatively, if it's a server application with lots of little
requests, then gen0 can be quite small, ideally equivalent to the
maximum working set of a server request. If this were true, then gen1
would only hold objects that were alive during a gen0 collection, but
next time a gen1 collection occurs everything in gen1 dies.
Above all, the GC hopes that it never has to collect gen2, because
collecting gen2 is very slow. The worst pattern for a generational GC to
deal with is an object which gets built up to a large size, so it
survives a few GCs, and then dies. Generational GCs assume that young
objects die fast, and old objects ideally never die.
Of course, if you continue to make allocations (which is the only way
the GC will run by itself), gen2 will eventually get collected.
Otherwise, much memory, in particular objects allocated from the Large
Object Heap (85,000 bytes or larger), would never get collected.
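You can see the Large Object Heap tie-in to gen2 with GC.GetGeneration. This is a sketch with hypothetical names, assuming the default large-object threshold: a freshly allocated array above that size typically reports the oldest generation straight away, while a small one starts in gen0.

```csharp
using System;

static class LargeObjectCheck
{
    // Hypothetical helper: allocates a byte array of the given size and
    // returns the generation the GC reports for it immediately afterwards.
    public static int GenerationOfNewArray(int sizeInBytes)
    {
        byte[] data = new byte[sizeInBytes];
        int generation = GC.GetGeneration(data);
        GC.KeepAlive(data);
        return generation;
    }

    static void Main()
    {
        Console.WriteLine("1,000-byte array:   gen" + GenerationOfNewArray(1000));
        Console.WriteLine("100,000-byte array: gen" + GenerationOfNewArray(100000));
        Console.WriteLine("GC.MaxGeneration:   " + GC.MaxGeneration);
    }
}
```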
-- Barry