Alternative heavy right tailed distribution to exponential distribution
$begingroup$
I have data whose distribution resembles an exponential distribution, but with a heavier tail than the exponential.
I would be very glad of any recommendation for an alternative to the exponential distribution for these data.
exponential
$endgroup$
edited 3 hours ago by Nick Cox
asked 4 hours ago by oercim
$begingroup$
A good basis for any such recommendation is a theory about the underlying process, because tail estimation can be highly uncertain without having a large amount of data. What additional information can you supply about your problem?
$endgroup$
– whuber♦
3 hours ago
$begingroup$
@whuber, the data are the durations (in months) between the day of a car accident and the day it was reported. I don't have much data, though. Most accidents seem to be reported quickly, but some are reported late. When I fit an exponential distribution to the data, its tail looks too light compared with the data.
$endgroup$
– oercim
3 hours ago
$begingroup$
One might suppose there is a statute of limitations or insurance limit, most likely a whole number of years. This wouldn't be modeled well by any standard heavy-tailed distribution. You ought to closely examine what data you do have.
$endgroup$
– whuber♦
2 hours ago
1 Answer
$begingroup$
Given your discussion with @whuber, I would suggest two approaches.
(1) A mixture model, perhaps a mixture of exponentials for simplicity. One can think of this as a hierarchical model in which observations come from different sub-populations, each with its own distribution. From what you've described, this sounds like your situation: most people who have an accident report it to their insurance provider almost immediately, while those who wait any significant amount of time are likely following a very different pattern.
(2) Kaplan-Meier curves. This is a nonparametric approach that makes no assumptions about the baseline distribution. It is a very simple approach that tells you what the data say, not what a constrained model of the data says.
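To make option (1) concrete, here is a minimal sketch of fitting a two-component exponential mixture by EM. It is only an illustration under assumptions of mine: the function name `fit_exp_mixture`, the median-split starting values, and the synthetic data (80% fast reporters with mean 0.5 months, 20% slow reporters with mean 6 months) are all hypothetical and not taken from the question.

```python
import numpy as np


def fit_exp_mixture(x, n_iter=500, tol=1e-10):
    """EM for a two-component exponential mixture (illustrative sketch).

    Returns (w, mean_fast, mean_slow): the weight of the fast component
    and the means of the fast and slow components.
    """
    x = np.asarray(x, dtype=float)
    # Crude starting values (an arbitrary choice): split the data at the median.
    med = np.median(x)
    w = 0.5
    m1 = max(x[x <= med].mean(), 1e-12)
    m2 = max(x[x > med].mean(), 1e-12)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: posterior probability that each point is from component 1.
        d1 = w * np.exp(-x / m1) / m1
        d2 = (1.0 - w) * np.exp(-x / m2) / m2
        total = d1 + d2
        r = d1 / total
        # M-step: weighted proportion and weighted means.
        w = r.mean()
        m1 = np.sum(r * x) / np.sum(r)
        m2 = np.sum((1.0 - r) * x) / np.sum(1.0 - r)
        # Log-likelihood at the parameters used in this E-step.
        ll = np.sum(np.log(total))
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    # Label the component with the smaller mean as "fast".
    if m1 > m2:
        w, m1, m2 = 1.0 - w, m2, m1
    return w, m1, m2


# Synthetic stand-in for the reporting delays (hypothetical parameters).
rng = np.random.default_rng(0)
n = 5000
is_fast = rng.random(n) < 0.8
delays = np.where(is_fast, rng.exponential(0.5, n), rng.exponential(6.0, n))

w_hat, mean_fast, mean_slow = fit_exp_mixture(delays)
print(f"weight of fast reporters: {w_hat:.3f}")
print(f"mean delay, fast: {mean_fast:.3f}  slow: {mean_slow:.3f}")
```

With little data the slow component may be poorly determined, so it is worth trying several starting values and looking at how much the estimates move.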
Whether you want to use (1) or (2) depends on your use case; are you interested in knowing the proportion of subjects who don't follow the trend of reporting very quickly? Then (1) answers this question a little more directly (assuming a good fit). Are you just interested in seeing the overall distribution of time to reporting without a model's interpretation of the data? Then use (2).
Also, my advice is that even if you decide to use (1), you should still compare the overall fit with (2) as a form of model checking.
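The model check suggested above could be sketched as follows: compute the Kaplan-Meier estimate and compare it with the survival function implied by a fitted mixture. Everything specific here is an assumption for illustration: the helper names `kaplan_meier` and `mixture_survival`, the ten made-up delays, and the mixture parameters (which in practice would come from your own fit). When no observation is censored, as in this toy example, the Kaplan-Meier estimate is just the empirical survival function.

```python
import numpy as np


def kaplan_meier(times, observed=None):
    """Kaplan-Meier estimate (illustrative sketch).

    times    : durations
    observed : True where the event was seen, False where censored
               (defaults to all observed)
    Returns the distinct event times and the estimated survival
    probability just after each of them.
    """
    times = np.asarray(times, dtype=float)
    if observed is None:
        observed = np.ones(times.shape, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    event_times = np.unique(times[observed])
    # Number still at risk just before each event time.
    at_risk = np.array([(times >= t).sum() for t in event_times])
    # Number of events at each event time.
    events = np.array([((times == t) & observed).sum() for t in event_times])
    surv = np.cumprod(1.0 - events / at_risk)
    return event_times, surv


def mixture_survival(t, w, mean_fast, mean_slow):
    """Survival function of a two-component exponential mixture."""
    t = np.asarray(t, dtype=float)
    return w * np.exp(-t / mean_fast) + (1.0 - w) * np.exp(-t / mean_slow)


# Made-up reporting delays in months (hypothetical data).
delays = np.array([0.1, 0.2, 0.2, 0.3, 0.5, 0.8, 1.0, 2.5, 7.0, 14.0])
km_times, km_surv = kaplan_meier(delays)

# Hypothetical fitted parameters; replace with the output of your own fit.
model_surv = mixture_survival(km_times, w=0.7, mean_fast=0.4, mean_slow=6.0)

for t, s_km, s_model in zip(km_times, km_surv, model_surv):
    print(f"t={t:5.1f}  KM={s_km:.3f}  mixture={s_model:.3f}")
print("largest gap:", np.max(np.abs(km_surv - model_surv)))
```

Large, systematic gaps in the tail would suggest the mixture is still too light-tailed, or that something like a reporting deadline is shaping the data.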
$endgroup$
answered 2 hours ago by Cliff AB
$begingroup$
Thanks a lot for the answer. My aim is to run some simulations of the duration: I want to generate random data for different parameters of the underlying distribution. I guess (1) is better suited to that aim, but fitting such a distribution may be hard. I will give it a try.
$endgroup$
– oercim
2 hours ago
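For the simulation goal mentioned in this comment, drawing from a two-component exponential mixture is straightforward once parameters are chosen. The sketch below is only an illustration: `sample_exp_mixture` is a hypothetical helper, and the weights and means are made-up values, not estimates from the real data. It also compares the simulated 99th percentile with that of a single exponential having the same mean, to show how much heavier the mixture's tail is.

```python
import numpy as np


def sample_exp_mixture(n, w, mean_fast, mean_slow, rng):
    """Draw n values: with probability w from Exp(mean_fast), else Exp(mean_slow)."""
    from_fast = rng.random(n) < w
    return np.where(
        from_fast,
        rng.exponential(mean_fast, n),
        rng.exponential(mean_slow, n),
    )


rng = np.random.default_rng(42)
mean_fast, mean_slow = 0.5, 6.0  # hypothetical component means, in months

for w in (0.9, 0.8, 0.6):  # hypothetical share of fast reporters
    sims = sample_exp_mixture(100_000, w, mean_fast, mean_slow, rng)
    mix_mean = w * mean_fast + (1.0 - w) * mean_slow
    q99_mix = np.quantile(sims, 0.99)
    # 99th percentile of a single exponential with the same mean.
    q99_exp = -mix_mean * np.log(0.01)
    print(
        f"w={w:.1f}  mean={sims.mean():.2f}  "
        f"q99 mixture={q99_mix:.1f}  q99 exponential={q99_exp:.1f}"
    )
```

Looping over a grid of weights and means in this way gives simulated data sets for each parameter setting of interest.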