Click here to Skip to main content
15,899,124 members
Please Sign up or sign in to vote.
0.00/5 (No votes)
See more:
All,

many thanks for reading my problem!

I have a dataset that tells me each month how many new clients I have won, and how many I have lost. I want to estimate average client lifetime from this data-set.

To expand a little on my issue:

I could limit the dataset to those clients who have come and gone, for whom I have a start date and an end date. It is then trivial to take an average lifetime. But what about those loyal clients who haven't left yet! They are not affecting this lifetime calculation, so I am probably underestimating.

Put another way, if two clients joined me a year ago. One left promptly after one month but the other has remained with me. I'm not happy with an algorithm that tells me the average lifetime is one month. But how do I treat the other client?

Thanks!

Jon

What I have tried:

My best idea is to take the number of losses in a time-period (probably monthly) and divide that by the total number of clients in that month. I can apply a moving average or some-such algorithm to this measure to refine it more. But what I don't have is a clear justification for this approach! (But at least it's simple to code up!)

Other things I've tried fall foul when my sales are not constant. I have approaches that work if sales numbers are constant or the client numbers are roughly constant, but fall over in the cases of high growth or decline.
Posted
Updated 13-Apr-16 0:31am

1 solution

First, I'd pool the two client types separately: those who've a fixed lifetime value and those who's lifetime as a client is still open-ended. They're really not compatible data in any good way. A new client for a month, averaged with one of ten years gives a poor result, even if both are still clients. You need to make an assumption about the non-existent closing dates of clients that may never close their account.

You've not given a language for this, so I'll offer two options for the loyal customers.

If using SQL, when you request the closing date, use

ISNULL('closing date field name', GETDATE())

in your select when retrieving the date (by whatever means you determine the date difference) and you can now average the life that includes the missing values.
OR
If doing this with the data set after it is retrieved from wherever you have it, then you need to do a similar replacement in the array for elements without a date.

Now, as in the first paragraph, I don't see any valid way of (1) mixing the two types of clients, and (2) I don't see a legitimate way of averaging values that are actually open-ended in their magnitude.

Best hope, perhaps, is a histogram display, particularly for the open-ended group. I can visualize that one can compare the two flavors of histogram you could generate and come up with some conclusions.
 
Share this answer
 
v2
Comments
JonPoley3 15-Apr-16 5:48am    
Thank you Balboos!

I've been pointed in the direction of cricketing batting averages which suffer an analogous problem. i.e. if a batsmen is not out how can this innings get added to his average? Seems its been an unsolved problem there for a long time too!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900