2 sample t test for sample sizes

2 sample t test for sample sizes - 30,000 and 150,000

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ margin-bottom:0;
}

I have 2 samples, one with sample size of 30,000 customers and the other with 150,000. I have to perform a 2 sample t test(on conversion rates of the 2 groups). My question is, will t test in this case be biased towards the smaller sample? If yes, what is the correct approach to perform a test?

edited 6 hours ago

asked 6 hours ago

Shivam Tiwari

New contributor

6

$begingroup$
Samples of that size will almost certainly result in statistically significant findings, but the differences may not be of any practical significance. See here for another discussion about this: stats.stackexchange.com/questions/4075/…. What are the actual goals of your analysis too?
$endgroup$
– StatsStudent
4 hours ago

add a comment |

edited 6 hours ago

asked 6 hours ago

Shivam Tiwari

New contributor

6

$begingroup$
Samples of that size will almost certainly result in statistically significant findings, but the differences may not be of any practical significance. See here for another discussion about this: stats.stackexchange.com/questions/4075/…. What are the actual goals of your analysis too?
$endgroup$
– StatsStudent
4 hours ago

add a comment |

edited 6 hours ago

asked 6 hours ago

Shivam Tiwari

New contributor

hypothesis-testing statistical-significance t-test ab-test

edited 6 hours ago

asked 6 hours ago

Shivam Tiwari

New contributor

edited 6 hours ago

asked 6 hours ago

Shivam Tiwari

New contributor

edited 6 hours ago

asked 6 hours ago

Shivam Tiwari

New contributor

asked 6 hours ago

Shivam Tiwari

asked 6 hours ago

Shivam Tiwari

New contributor

Shivam Tiwari is a new contributor to this site. Take care in asking for clarification, commenting, and answering.
Check out our Code of Conduct.

6

$begingroup$
Samples of that size will almost certainly result in statistically significant findings, but the differences may not be of any practical significance. See here for another discussion about this: stats.stackexchange.com/questions/4075/…. What are the actual goals of your analysis too?
$endgroup$
– StatsStudent
4 hours ago

add a comment |

6

$begingroup$
Samples of that size will almost certainly result in statistically significant findings, but the differences may not be of any practical significance. See here for another discussion about this: stats.stackexchange.com/questions/4075/…. What are the actual goals of your analysis too?
$endgroup$
– StatsStudent
4 hours ago

Samples of that size will almost certainly result in statistically significant findings, but the differences may not be of any practical significance. See here for another discussion about this: stats.stackexchange.com/questions/4075/…. What are the actual goals of your analysis too?

– StatsStudent
4 hours ago

add a comment |

2 Answers
2

active

oldest

votes

I can hardly imagine any worthwhile effect size that requires such a large sample size to be decently powered. There's no "bias" of having unequal sample sizes$^1$. The only disadvantage is that the power of the test tends to be somewhat limited by the smaller group. For even very small effects, 30,000 observations may confer quite a powerful test.

$^1$ except if you inappropriately use the "equal variance" assumption, in which case the "pooled variance" estimate is more heavily weighted toward the larger group (not toward the smaller as you suggested).

edited 4 hours ago

answered 6 hours ago

AdamO

35.2k265143

add a comment |

Maybe a couple of examples will help to illustrate some of the issues.

Suppose the two populations are $X sim mathsf{Norm}(mu = 500, sigma =30)$
and $Y sim mathsf{Norm}(mu = 501, sigma = 20.)$

If both sample sizes are $150,000,$ then there is sufficient power to detect
the small difference in means.

set.seed(422)

x = rnorm(150000, 500, 30)

y = rnorm(150000, 501, 20)

t.test(x, y)



        Welch Two Sample t-test



data:  x and y

t = -10.983, df = 261530, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.2042715 -0.8395487

sample estimates:

mean of x mean of y 

 499.9804  501.0023

If we use only the first 30,000 values in the first sample, results are
very nearly the same for most practical purposes.

t.test(x[1:30000], y)



        Welch Two Sample t-test



data:  x[1:30000] and y

t = -6.3728, df = 35463, p-value = 1.879e-10

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.5126269 -0.8010336

sample estimates:

mean of x mean of y 

 499.8455  501.0023

Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):

enter image description here

Issues of minimal concern:

Even though labeled as 'Welch t tests', sample sizes are sufficiently large
that these are essentially t tests. Unless the data are very far from normal,
we would still detect the small difference in means.

The power of the test is heavily dependent on the smaller sample size. But
power is not a concern here.

Issues warranting attention:

With such large samples
in the real world (not the simulation world),
one is entitled to wonder whether data are truly simple random samples from
their respective populations. Could smaller, more carefully collected samples provide better information?

Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
it is OK for variances to differ. But would different variances have important practical implications?

Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
taking the effort of check whether means are different? And what do the results
of the t test actually contribute to that purpose?

edited 4 hours ago

answered 5 hours ago

BruceET

7,1561721

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "65"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

Shivam Tiwari is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstats.stackexchange.com%2fquestions%2f404439%2f2-sample-t-test-for-sample-sizes-30-000-and-150-000%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

2 Answers
2

active

oldest

votes

2 Answers
2

active

oldest

votes

edited 4 hours ago

answered 6 hours ago

AdamO

35.2k265143

add a comment |

edited 4 hours ago

answered 6 hours ago

AdamO

35.2k265143

add a comment |

edited 4 hours ago

answered 6 hours ago

AdamO

35.2k265143

edited 4 hours ago

answered 6 hours ago

AdamO

35.2k265143

edited 4 hours ago

answered 6 hours ago

AdamO

35.2k265143

answered 6 hours ago

AdamO

35.2k265143

answered 6 hours ago

AdamO

35.2k265143

add a comment |

Maybe a couple of examples will help to illustrate some of the issues.

Suppose the two populations are $X sim mathsf{Norm}(mu = 500, sigma =30)$
and $Y sim mathsf{Norm}(mu = 501, sigma = 20.)$

If both sample sizes are $150,000,$ then there is sufficient power to detect
the small difference in means.

set.seed(422)

x = rnorm(150000, 500, 30)

y = rnorm(150000, 501, 20)

t.test(x, y)



        Welch Two Sample t-test



data:  x and y

t = -10.983, df = 261530, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.2042715 -0.8395487

sample estimates:

mean of x mean of y 

 499.9804  501.0023

If we use only the first 30,000 values in the first sample, results are
very nearly the same for most practical purposes.

t.test(x[1:30000], y)



        Welch Two Sample t-test



data:  x[1:30000] and y

t = -6.3728, df = 35463, p-value = 1.879e-10

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.5126269 -0.8010336

sample estimates:

mean of x mean of y 

 499.8455  501.0023

Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):

enter image description here

Issues of minimal concern:

Even though labeled as 'Welch t tests', sample sizes are sufficiently large
that these are essentially t tests. Unless the data are very far from normal,
we would still detect the small difference in means.

The power of the test is heavily dependent on the smaller sample size. But
power is not a concern here.

Issues warranting attention:

With such large samples
in the real world (not the simulation world),
one is entitled to wonder whether data are truly simple random samples from
their respective populations. Could smaller, more carefully collected samples provide better information?

Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
it is OK for variances to differ. But would different variances have important practical implications?

Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
taking the effort of check whether means are different? And what do the results
of the t test actually contribute to that purpose?

edited 4 hours ago

answered 5 hours ago

BruceET

7,1561721

add a comment |

Maybe a couple of examples will help to illustrate some of the issues.

Suppose the two populations are $X sim mathsf{Norm}(mu = 500, sigma =30)$
and $Y sim mathsf{Norm}(mu = 501, sigma = 20.)$

If both sample sizes are $150,000,$ then there is sufficient power to detect
the small difference in means.

set.seed(422)

x = rnorm(150000, 500, 30)

y = rnorm(150000, 501, 20)

t.test(x, y)



        Welch Two Sample t-test



data:  x and y

t = -10.983, df = 261530, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.2042715 -0.8395487

sample estimates:

mean of x mean of y 

 499.9804  501.0023

If we use only the first 30,000 values in the first sample, results are
very nearly the same for most practical purposes.

t.test(x[1:30000], y)



        Welch Two Sample t-test



data:  x[1:30000] and y

t = -6.3728, df = 35463, p-value = 1.879e-10

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.5126269 -0.8010336

sample estimates:

mean of x mean of y 

 499.8455  501.0023

Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):

enter image description here

Issues of minimal concern:

Even though labeled as 'Welch t tests', sample sizes are sufficiently large
that these are essentially t tests. Unless the data are very far from normal,
we would still detect the small difference in means.

The power of the test is heavily dependent on the smaller sample size. But
power is not a concern here.

Issues warranting attention:

With such large samples
in the real world (not the simulation world),
one is entitled to wonder whether data are truly simple random samples from
their respective populations. Could smaller, more carefully collected samples provide better information?

Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
it is OK for variances to differ. But would different variances have important practical implications?

Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
taking the effort of check whether means are different? And what do the results
of the t test actually contribute to that purpose?

edited 4 hours ago

answered 5 hours ago

BruceET

7,1561721

add a comment |

Maybe a couple of examples will help to illustrate some of the issues.

Suppose the two populations are $X sim mathsf{Norm}(mu = 500, sigma =30)$
and $Y sim mathsf{Norm}(mu = 501, sigma = 20.)$

If both sample sizes are $150,000,$ then there is sufficient power to detect
the small difference in means.

set.seed(422)

x = rnorm(150000, 500, 30)

y = rnorm(150000, 501, 20)

t.test(x, y)



        Welch Two Sample t-test



data:  x and y

t = -10.983, df = 261530, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.2042715 -0.8395487

sample estimates:

mean of x mean of y 

 499.9804  501.0023

If we use only the first 30,000 values in the first sample, results are
very nearly the same for most practical purposes.

t.test(x[1:30000], y)



        Welch Two Sample t-test



data:  x[1:30000] and y

t = -6.3728, df = 35463, p-value = 1.879e-10

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.5126269 -0.8010336

sample estimates:

mean of x mean of y 

 499.8455  501.0023

Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):

enter image description here

Issues of minimal concern:

Even though labeled as 'Welch t tests', sample sizes are sufficiently large
that these are essentially t tests. Unless the data are very far from normal,
we would still detect the small difference in means.

The power of the test is heavily dependent on the smaller sample size. But
power is not a concern here.

Issues warranting attention:

With such large samples
in the real world (not the simulation world),
one is entitled to wonder whether data are truly simple random samples from
their respective populations. Could smaller, more carefully collected samples provide better information?

Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
it is OK for variances to differ. But would different variances have important practical implications?

Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
taking the effort of check whether means are different? And what do the results
of the t test actually contribute to that purpose?

edited 4 hours ago

answered 5 hours ago

BruceET

7,1561721

Maybe a couple of examples will help to illustrate some of the issues.

Suppose the two populations are $X sim mathsf{Norm}(mu = 500, sigma =30)$
and $Y sim mathsf{Norm}(mu = 501, sigma = 20.)$

If both sample sizes are $150,000,$ then there is sufficient power to detect
the small difference in means.

set.seed(422)

x = rnorm(150000, 500, 30)

y = rnorm(150000, 501, 20)

t.test(x, y)



        Welch Two Sample t-test



data:  x and y

t = -10.983, df = 261530, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.2042715 -0.8395487

sample estimates:

mean of x mean of y 

 499.9804  501.0023

If we use only the first 30,000 values in the first sample, results are
very nearly the same for most practical purposes.

t.test(x[1:30000], y)



        Welch Two Sample t-test



data:  x[1:30000] and y

t = -6.3728, df = 35463, p-value = 1.879e-10

alternative hypothesis: true difference in means is not equal to 0

95 percent confidence interval:

 -1.5126269 -0.8010336

sample estimates:

mean of x mean of y 

 499.8455  501.0023

Here is a boxplot of the data used in the second t test (the wider box indicates a larger sample):

enter image description here

Issues of minimal concern:

Even though labeled as 'Welch t tests', sample sizes are sufficiently large
that these are essentially t tests. Unless the data are very far from normal,
we would still detect the small difference in means.

The power of the test is heavily dependent on the smaller sample size. But
power is not a concern here.

Issues warranting attention:

With such large samples
in the real world (not the simulation world),
one is entitled to wonder whether data are truly simple random samples from
their respective populations. Could smaller, more carefully collected samples provide better information?

Although we did not do a formal test to confirm that variances differ, it seems clear from the boxplot that they do. In the Welch test,
it is OK for variances to differ. But would different variances have important practical implications?

Although the null hypothesis that the two population means are equal is soundly rejected with minuscule P-values, it is important to realize that "statistically significant" differences (by whatever definition) are not necessarily differences of practical importance or interest. For what purpose are you
taking the effort of check whether means are different? And what do the results
of the t test actually contribute to that purpose?

edited 4 hours ago

answered 5 hours ago

BruceET

7,1561721

edited 4 hours ago

answered 5 hours ago

BruceET

7,1561721

answered 5 hours ago

BruceET

7,1561721

answered 5 hours ago

BruceET

7,1561721

add a comment |

Shivam Tiwari is a new contributor. Be nice, and check out our Code of Conduct.

draft saved

draft discarded

Shivam Tiwari is a new contributor. Be nice, and check out our Code of Conduct.

Thanks for contributing an answer to Cross Validated!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Cfxtrjtrk