Wednesday, September 12, 2012

Fixing an Android Memory Leak

One of the most dreaded bugs in Android is a memory leak. Leaks are nasty because one piece of code causes the problem, but your application crashes in some other, unrelated piece of code. The general rules of causality seem to be violated. My adventure began in the Android Developer Console's Application Error Reports section, where I found several crash reports labelled OutOfMemoryError.

Out of memory errors can be divided into two basic types. Type one is the obvious one: you are actually out of memory, trying to allocate more memory than your app's heap has. Type two is a memory leak, which is harder to find. Looking at the crash reports, it was easy to tell that I had a leak. The first clue was that the crashes were not coming from one place but from several. This is part of what makes memory leaks tough.
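
For the first type, a quick sanity check is to compare how much heap is in use against the ceiling the runtime allows. Here is a minimal sketch using the standard java.lang.Runtime calls. The class and method names (HeapSnapshot, percentFree, wouldExceedHeap) are made up for illustration and are not part of the app described in this post, and the figures will not necessarily match what the garbage collector logs.

```java
/** Hypothetical helper for illustration; not part of the app in this post. */
class HeapSnapshot {
    final long usedBytes;
    final long maxBytes;

    HeapSnapshot(long usedBytes, long maxBytes) {
        this.usedBytes = usedBytes;
        this.maxBytes = maxBytes;
    }

    /** Reads the current heap usage and the heap ceiling from the runtime. */
    static HeapSnapshot capture() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return new HeapSnapshot(used, rt.maxMemory());
    }

    /** Percentage of the maximum heap that is still available (0 to 100). */
    int percentFree() {
        return (int) (100 * (maxBytes - usedBytes) / maxBytes);
    }

    /** True if an allocation of this size cannot fit under the heap ceiling. */
    boolean wouldExceedHeap(long requestedBytes) {
        return usedBytes + requestedBytes > maxBytes;
    }
}
```

If a single request fails this check while plenty of heap is free, you are probably looking at type one. If the free percentage keeps shrinking over time, type two is the more likely suspect.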

At this point I decided to do a low-level check for the issue. I hooked up a phone to my PC and ran DDMS. Android's garbage collector writes information to LogCat as it runs. This information is short but sweet.

GC_CONCURRENT freed 1837K, 41% free 6232K/10503K, external 6427K/7571K, paused 3ms+4ms
GC_FOR_MALLOC freed 486K, 41% free 6372K/10695K, external 6764K/7459K, paused 47ms
GC_EXTERNAL_ALLOC freed 1147K, 41% free 6414K/10823K, external 7002K/7459K, paused 60ms

The most important piece of information above is the percentage free (41% in each of these lines). This number will fluctuate as you use your application, but it should hover around certain values. You can never use 100% of the memory. Below 30% free, the system will begin to respond sluggishly, and the time needed to allocate memory will go from less than 10ms to above 100ms. Below 20%, you will start seeing messages like:

Clamp target GC heap from 32.222MB to 32.000MB
GC_FOR_MALLOC freed 0K, 18% free 19695K/24007K, external 7950K/8765K, paused 171ms

These messages are an indication that the system is getting ready to crash.
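
If you want to watch these numbers without squinting at LogCat, the lines are regular enough to parse. Below is a rough sketch that pulls the percentage free and the heap figures out of a line in the format shown above and applies my 30% and 20% rules of thumb. The class name GcLogLine and the labels it returns are invented for this example, and the pattern only targets the log format shown in this post, so other Android versions may log differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the GC log lines quoted in this post. */
class GcLogLine {
    // Matches e.g. "GC_CONCURRENT freed 1837K, 41% free 6232K/10503K"
    private static final Pattern GC_LINE =
            Pattern.compile("(GC_\\w+) freed (\\d+)K, (\\d+)% free (\\d+)K/(\\d+)K");

    final String reason;    // e.g. GC_FOR_MALLOC
    final int percentFree;  // the number to watch
    final int usedKb;       // heap in use after the collection
    final int totalKb;      // current heap size

    private GcLogLine(String reason, int percentFree, int usedKb, int totalKb) {
        this.reason = reason;
        this.percentFree = percentFree;
        this.usedKb = usedKb;
        this.totalKb = totalKb;
    }

    /** Returns the parsed line, or null if the text is not a GC line. */
    static GcLogLine parse(String line) {
        Matcher m = GC_LINE.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new GcLogLine(
                m.group(1),
                Integer.parseInt(m.group(3)),
                Integer.parseInt(m.group(4)),
                Integer.parseInt(m.group(5)));
    }

    /** Applies the rule-of-thumb thresholds from this post. */
    String health() {
        if (percentFree < 20) {
            return "DANGER";
        }
        if (percentFree < 30) {
            return "SLUGGISH";
        }
        return "OK";
    }
}
```

Fed the first log line in this post, this sketch reports 41% free and "OK". Fed the 18% line above, it reports "DANGER".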

As I stepped through the app, everything looked fine. I would watch the free memory tick downward, and just when I thought I had found the culprit, it would suddenly shoot back up. Then I realized what I might be doing wrong. We have an active base of over 500,000 users. In the field, users are not taking a tour of all of the app's features as I was doing. If they were, we would see a lot more than the average of six reports per week over this issue. Instead, I reasoned this was probably more a case of a user doing something excessively, such as navigating between two pages or repeatedly tapping a button.

From the crash reports, I knew that the app was usually trying to allocate memory for bitmaps when it crashed. I knew the area of the app where the most bitmaps are allocated. And I had a theory about the kind of thing the user might be doing before the crash. Now all I needed was a bit of luck.

Eureka

One section of our app consists of about eight pages hooked up to a ViewFlipper. On the last page of the flipper, the user is taken back to the first page. It is a loop and it goes in both directions. What would happen if the user kept looping through the pages, I wondered? The answer: free memory would slowly decrease. I almost didn't notice it, but each loop through the pages would take a percent or so off the free memory. Now I was sure I had a memory leak, but I still didn't know the cause.

(initial)
GC_CONCURRENT freed 1803K, 43% free 5747K/9991K, external 6289K/7624K, paused 3ms+4ms
(first pass)
GC_CONCURRENT freed 1486K, 32% free 8375K/12295K, external 12760K/12988K, paused 3ms+5ms
(second pass)
GC_EXTERNAL_ALLOC freed 919K, 30% free 10014K/14279K, external 11183K/11636K, paused 93ms
(fourth pass)
GC_EXTERNAL_ALLOC freed 6K, 29% free 11111K/15559K, external 9558K/10063K, paused 94ms
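
The difference between normal fluctuation and a leak is that the readings never recover. Here is a small sketch of that check, taking one percentage-free reading per pass. The class name LeakTrend and its methods are hypothetical, and a steady decline is only a hint that something is leaking, not proof.

```java
/** Hypothetical helper: spots readings that only ever go down. */
class LeakTrend {

    /** True when no reading is higher than the one before it and there is a net drop. */
    static boolean isSteadilyDeclining(int[] percentFreePerPass) {
        if (percentFreePerPass.length < 2) {
            return false;
        }
        for (int i = 1; i < percentFreePerPass.length; i++) {
            if (percentFreePerPass[i] > percentFreePerPass[i - 1]) {
                return false; // memory came back, so this looks like normal churn
            }
        }
        return percentFreePerPass[percentFreePerPass.length - 1] < percentFreePerPass[0];
    }

    /** Percentage points lost between the first and last reading. */
    static int totalDrop(int[] percentFreePerPass) {
        if (percentFreePerPass.length == 0) {
            return 0;
        }
        return percentFreePerPass[0] - percentFreePerPass[percentFreePerPass.length - 1];
    }
}
```

With the four readings above (43, 32, 30, 29), the check comes back true with a total drop of 14 points. A sequence that bounces back up, such as 41, 38, 41, 40, comes back false.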

DDMS and the Eclipse Memory Analyzer Tool

DDMS has some rudimentary built-in heap tools, but they shine when paired with the free Eclipse Memory Analyzer Tool (MAT). MAT can take the heap dumps that DDMS creates and fully analyze them. It does way more things than I have the time to learn, but I did figure out enough to have it help me find my issue. All I needed to do was generate the Leak Suspects report.

Note: Memory profiling in the DDMS is very resource intensive, so much so that I would only recommend doing it on a device and never on the emulator.

To generate the Leak Suspects Report:
  • Once you have the app in the area of the suspected memory leak, click the Update Heap icon
  • Do some activity to cause some memory to leak
  • Click the heap dump icon
  • In MAT's Getting Started Wizard, click Leak Suspects Report, then click Finish
  • After a delay, the Leak Suspects report will appear




The Leak Suspects Report

The report generates a graph of the things it thinks may be memory leaks. Be careful here: not everything it flags is actually a leak. Remember, your program needs to have some objects allocated in order to function. What you really need to be suspicious of are things that have multiple instances allocated or that use huge amounts of memory. In my case, it was problem suspect #2. It showed 10 instances of the Dealer object allocated. As I looped through the pages, the number of these objects grew and so did their size. But why? The Dealer object was a very simple class. It held the name, address, telephone number, etc. of a dealer. Yet according to MAT it had a shallow heap size of 64 bytes but a retained heap size of 1.2 MB. What?

Leak Suspects
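
The gap between shallow and retained size is easier to see on a toy example. Shallow size is what the object itself occupies. Retained size is everything that would be freed if that object went away, meaning the objects reachable only through it. The sketch below works this out on a hand-built object graph. ToyHeap and all the object names and byte counts you feed it are invented for illustration, and this brute-force approach is not how MAT computes the figure.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy model of a heap: named objects, sizes, and references. */
class ToyHeap {
    private final Map<String, Long> shallowSize = new HashMap<>();
    private final Map<String, List<String>> refs = new HashMap<>();

    void addObject(String name, long shallowBytes) {
        shallowSize.put(name, shallowBytes);
        refs.put(name, new ArrayList<>());
    }

    void addReference(String from, String to) {
        refs.get(from).add(to);
    }

    /** Objects reachable from root, pretending 'excluded' does not exist. */
    private Set<String> reachable(String root, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        if (!root.equals(excluded)) {
            stack.push(root);
        }
        while (!stack.isEmpty()) {
            String current = stack.pop();
            if (!seen.add(current)) {
                continue; // already visited
            }
            for (String next : refs.get(current)) {
                if (!next.equals(excluded)) {
                    stack.push(next);
                }
            }
        }
        return seen;
    }

    /** Bytes that become unreachable from root if target is removed. */
    long retainedSize(String root, String target) {
        Set<String> withTarget = reachable(root, null);
        Set<String> withoutTarget = reachable(root, target);
        long total = 0;
        for (String name : withTarget) {
            if (!withoutTarget.contains(name)) {
                total += shallowSize.get(name);
            }
        }
        return total;
    }
}
```

Build a chain where a 64 byte Dealer points at a view, the view points at an activity, and the activity points at a bitmap, and the Dealer's retained size is the sum of the whole chain. Add a second path from the root to the activity and the retained size collapses to just the Dealer and the view, because the activity no longer depends on the Dealer to stay alive.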

It is All in the Details

In order to determine why a simple object, which should be no more than a few hundred bytes, was hanging onto 1.2 MB, I clicked on the Details link. This causes three more reports to appear: Shortest Paths To the Accumulation Point, Accumulated Objects, and Accumulated Objects by Class.

Each of these reports gave me a lot of information. The Shortest Paths To the Accumulation Point report showed me that my simple Dealer object was holding onto a lot of stuff, including an activity. This is a very common cause of memory leaks in Android: an object is somehow holding onto a reference to an activity, which prevents the activity from being garbage collected. The Accumulated Objects report showed me that there were LinearLayouts involved in this too. And finally, the Accumulated Objects by Class report gave me the last key pieces of information I needed: there were TextViews and DealerInfoView objects involved too.

Shortest Paths To the Accumulation Point


Accumulated Objects

Accumulated Objects by Class

The DealerInfoView object is only new'ed up in one place, in the adapter for the Dealer ListView. Once I looked at the getView() method I saw the problem.

@Override
public View getView(int position, View convertView, ViewGroup parent) {
    View dealersRow = inflater.inflate(R.layout.carhubdealersrow, null);
    // Problem: this is the same long-lived instance stored in the shared dealers list
    Dealer dealer = dealers.get(position);
    new DealerInfoView(dealer, dealersRow);
    Button getQuote = (Button) dealersRow.findViewById(R.id.ButtonGetQuote);
    getQuote.setOnClickListener(new DealerQuoteDialog(activity, dealer, getFromHub()));
    ImageButton buttonDirections = (ImageButton) dealersRow.findViewById(R.id.ImageButtonDirections);
    buttonDirections.setTag(dealer);
    buttonDirections.setOnClickListener(new Directions(activity));
    return dealersRow;
}

The dealers object is an ArrayList of Dealer objects. It exists outside of this adapter and outlives the rows the adapter builds, while the adapter and its views are managed by Android. As the MAT reports showed, each long-lived Dealer ended up holding onto the row's views and, through them, the activity, so the garbage collector could not release them. The solution to this issue was relatively simple: create a local copy of the dealer object. The changed method is below. Only one small change was necessary to fix a painful bug.

@Override
public View getView(int position, View convertView, ViewGroup parent) {
    View dealersRow = inflater.inflate(R.layout.carhubdealersrow, null);
    // Fix: work on a copy so the row never touches the long-lived instance
    Dealer dealer = new Dealer(dealers.get(position));
    new DealerInfoView(dealer, dealersRow);
    Button getQuote = (Button) dealersRow.findViewById(R.id.ButtonGetQuote);
    getQuote.setOnClickListener(new DealerQuoteDialog(activity, dealer, getFromHub()));
    ImageButton buttonDirections = (ImageButton) dealersRow.findViewById(R.id.ImageButtonDirections);
    buttonDirections.setTag(dealer);
    buttonDirections.setOnClickListener(new Directions(activity));
    return dealersRow;
}

Because each row now works with its own copy instead of a reference to the long-lived Dealer in the list, the garbage collector is free to collect the copy along with the row's views. The memory leak is now plugged.
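
To show why a copy can make a difference, here is a stripped-down sketch in plain Java with no Android classes. Everything in it is a stand-in: I am assuming, purely for illustration, that binding a DealerInfoView leaves a back-reference to the row on the Dealer it is given (the attachedViews field). The real classes are not shown in this post and may hang onto the views in some other way.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for the app's Dealer class; the attachedViews field is an assumption. */
class Dealer {
    final String name;
    final List<Object> attachedViews = new ArrayList<>();

    Dealer(String name) {
        this.name = name;
    }

    /** Copy constructor: copies the data only, not the attached views. */
    Dealer(Dealer other) {
        this.name = other.name;
    }
}

/** Stand-in for DealerInfoView: binding it leaves a back-reference on the dealer. */
class DealerInfoView {
    DealerInfoView(Dealer dealer, Object row) {
        dealer.attachedViews.add(row);
    }
}

/** Hypothetical stand-in for the adapter's getView(), in both variants. */
class RowFactory {

    /** Leaky variant: binds the row to the long-lived dealer itself. */
    static Object buildRowShared(Dealer longLived) {
        Object row = new Object();
        new DealerInfoView(longLived, row);
        return row;
    }

    /** Fixed variant: binds the row to a throwaway copy. */
    static Object buildRowCopied(Dealer longLived) {
        Object row = new Object();
        new DealerInfoView(new Dealer(longLived), row);
        return row;
    }
}
```

Call buildRowShared ten times and the long-lived dealer is holding ten rows that can never be collected while the list is alive. Call buildRowCopied ten times and the long-lived dealer holds nothing, so each row and its copy can go as soon as the list view lets go of them.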

13 comments:

  1. This is a really amazing story from a real world application. Thank you for posting it.

    Replies
    1. Thanks for your support. More True Android Tales coming soon.

  2. Good writeup, useful technique. Thanks Troy!

    Replies
    1. Thanks for your support. If you like the post, please +1 it.

  3. Great post, Troy. Really enjoyed it. I have a favor to ask of you--I'm working on a tool to statically detect Android memory leaks like the one you describe. It sounds like this leak would be a great real-world stress test for the tool and I'd love to see if it can find it. Is there any way you could share your code (source or .class files are both fine) so that I could try out the tool on it?

    Thanks again for the nice narrative!

    Replies
    1. Hi Sam,

      Thanks for reading my post. Unfortunately the code belongs to my employer and there is no way I could release any part of it. Sorry.

      Troy

  4. Hey great post. But wouldnt you just be able to set the dealer object as null right before you return the view?

  5. Excellent write up on a very complex matter. Could you post some more examples on this as it would really help a lot of people out here.
    Thanks...

    Replies
    1. Hi Anshu,
      That is a great suggestion. As soon as my schedule clears up I will try to create a sample app which has some memory leaks in it. That way I can completely share the code with everyone.

      Troy

  6. "owns it". No clue what that means. This is what I kinda-dislike about Android programming. Its way too hard. I feel like a battered-programmer..... Why don't you fix my bug/leak/crashy/sluggish behavior here:

    http://sourceforge.net/projects/zombiespeedlimitlogger/

    Thanks for posting!
