Mikko Rönkkö
  • Videos: 268
  • Views: 991,964
AoM Meeting 2024 symposium: Precision in Data Preparation
Academy of Management Meeting 2024 symposium "Precision in Data Preparation: Navigating Methodologies & Statistical Practices for Robust Analysis"
The symposium focuses on key topics related to data preparation, such as data screening, outlier management, data transformation, careless responding, addressing common method variance, and dealing with data falsification. Each subject matter expert will shed light on the significance and nuances of these areas, delving into best practices, prevalent errors, and the pros and cons of different approaches. They will also offer actionable insights for practitioners and scholars in the field. The session will culminate with an interactive Q&A segmen...
Views: 157

Videos

AOM 2024 Moderation PDW with Jeremy Dawson (partial recording)
Views: 143 · 1 day ago
Live recording of the Academy of Management Meeting 2024 Moderation Professional Development Workshop. The recording covers only the first 1:10 of the workshop because the camera battery died. The audience still wanted the workshop posted. Link to materials: www.jeremydawson.com/pdw.htm including slides, datasets, R, Stata, and SPSS syntax. Link to online plotting tool: www.shinyapps.io/admin/#...
Normality assumption
Views: 376 · 1 day ago
I have often been asked to explain the normal distribution assumption to students working on master's theses. The normal distribution assumption is widely misunderstood, and its significance is often taught incorrectly. When comparing averages, the normal distribution has no relevance to the validity of the test. The non-normality of a variable also doesn't mean that the variable should be transf...
Normaalijakaumaoletus (in Finnish)
Views: 121 · 14 days ago
I have often been asked to explain the normal distribution assumption at a level that master's thesis writers can understand. The normal distribution assumption is widely misunderstood, and its significance is often taught incorrectly. When comparing means, the normal distribution does not matter at all for the validity of the test. The non-normality of a variable also does not mean that the variable should, for example, be log-transformed ...
Simulations in Stata
Views: 605 · 5 months ago
This screencast explains how to do statistical simulations in Stata. The Stata file can be found at osf.io/w5z7f
Simulations in R
Views: 615 · 5 months ago
This screencast explains how to do statistical simulations in R. The R file can be found at osf.io/5uep6
Reading research articles with Microsoft Edge and Copilot
Views: 1.3K · 8 months ago
This is a quick screencast that shows how you can use Microsoft Copilot in the Microsoft Edge browser to access GPT-4 for free and use GPT to help you understand research articles.
Metodifestivaalit 2023 Confirmatory Factor Analysis talk
Views: 560 · 11 months ago
A live recording from Metodifestivaalit 2023 conference. I was asked to talk about confirmatory factor analysis. I give a brief conceptual introduction to the topic after which I briefly explain my diagnostics workflow with an empirical example. Link to slides: osf.io/7bh5p
AoM 2023: Everything You Wanted to Know About Moderated Regression (But Were Afraid to Ask)
Views: 845 · 1 year ago
Get slides and materials at www.jeremydawson.com/pdw.htm. This is a live recording of a professional development workshop on moderation, given by Jeremy Dawson and Mikko Rönkkö at the Academy of Management Meeting 2023 in Boston. Although the testing of interaction effects via moderated regression is commonplace in management research, many misunderstandings and gaps in knowledge persist, and kno...
GPT in higher education
Views: 1.4K · 1 year ago
I was involved in creating the Jyväskylä University School of Business and Economics (JSBE) policy on the use of GPT and other language models in teaching. In the video, I talk about GPT and present four ways it could be used in higher education. Finally, I discuss the JSBE policy and why we decided on each specific guideline. The four use cases: 1) Cheating by writing entire essay assi...
Using AI to understand articles
Views: 2.7K · 1 year ago
In the past, I taught students that when they read an article on a topic they are not familiar with, they should start by finding the terms and definitions. In this video, I show another approach where I use ChatGPT to explain the article to me. This can provide a shortcut to understanding articles, but is not a substitute for reading the article because the AI sometimes goes horribly wrong.
GPT yliopistossa
Views: 988 · 1 year ago
I was involved in creating the Jyväskylä University School of Business and Economics (JSBE) policy on the use of GPT and other language models in university teaching. In the video, I present GPT and four ways it could be used in university teaching. Finally, I talk about the JSBE policy and why we ended up with these guidelines. Use cases: 1) Cheating by having AI write entire essay assignments. 2) Artik...
Linear model implies a covariance matrix
Views: 454 · 1 year ago
The video explains the application of path analysis tracing rules for calculating covariances in statistical models. It highlights the extension of these rules, originally used for correlations, to also encompass covariances. The significance of two-headed arrows in these rules is emphasized, indicating their role in quantifying correlations, covariances, and variances. The video demonstrates h...
Modification indices
Views: 1.3K · 1 year ago
Empirical identification checks (with Stata)
Views: 554 · 1 year ago
Convergence diagnostics workflow
Views: 304 · 1 year ago
Hessian matrix and model convergence
Views: 717 · 1 year ago
Starting values and model convergence
Views: 1.7K · 1 year ago
Non-convergence in structural equation models and other models
Views: 1.4K · 1 year ago
How to practice troubleshooting non converged models
Views: 1.7K · 1 year ago
Introduction to model convergence playlist
Views: 642 · 1 year ago
Biggest problem and learning more
Views: 471 · 1 year ago
Reading research articles
Views: 2.1K · 1 year ago
Troubleshooting Zotero references in a document
Views: 6K · 1 year ago
AoM Meeting 2022 live presentation about modelling of entrepreneurial orientation.
Views: 1.3K · 2 years ago
Conditional fixed-effects logistic model
Views: 6K · 2 years ago
Multilevel model in matrix form
Views: 1.7K · 2 years ago
Numerical integration in ML estimation
Views: 301 · 2 years ago
Factor scores
Views: 8K · 2 years ago
Errors in variables regression
Views: 1.8K · 2 years ago

Comments

  • @ltang
    @ltang 16 hours ago

    Around 7:49, are farmers less prestigious than the model predicted, or more? What does sitting below the y=x line mean?

    • @mronkko
      @mronkko 15 hours ago

      They are more prestigious. Check the residual on the y-axis. Anything above zero is respected more than what the model predicts.

    • @ltang
      @ltang 15 hours ago

      @mronkko So being below the y=x line just means that, theoretically, at that percentile we would expect the residual to be even higher? What does the theoretical percentile mean? Is it just based on rank?
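A minimal sketch of how the points of such a plot can be built, assuming the plot discussed in this thread is a normal quantile-quantile plot of residuals (the thread does not say so explicitly). The residual values and the plotting-position formula `(rank - 0.5) / n` are illustrative assumptions; other conventions exist. The theoretical value depends only on the rank of the observation and the fitted normal distribution, so a point below the y=x line is an observed residual that is smaller than the normal distribution would place at that rank.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(residuals):
    """Return (theoretical, observed) pairs for a normal Q-Q plot.

    The theoretical value is the quantile that a normal distribution with
    the same mean and standard deviation assigns to the percentile implied
    by the rank of each observation.
    """
    n = len(residuals)
    reference = NormalDist(mean(residuals), stdev(residuals))
    points = []
    for rank, observed in enumerate(sorted(residuals), start=1):
        percentile = (rank - 0.5) / n  # one common plotting position (assumed)
        points.append((reference.inv_cdf(percentile), observed))
    return points


# Hypothetical residuals with one large positive value.
residuals = [-3.0, -1.0, 0.0, 1.0, 8.0]
points = normal_qq_points(residuals)

for theoretical, observed in points:
    position = "above" if observed > theoretical else "on or below"
    print(f"theoretical={theoretical:7.2f}  observed={observed:5.1f}  ({position} the y=x line)")
```

In this sketch the largest residual lies above the line because it is larger than the value a normal distribution would put at the top rank.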

  • @simonakpadaka2891
    @simonakpadaka2891 1 day ago

    Many thanks for the insightful content. Permit me to ask a not-too-related question: what is the difference between robust regression (rreg) and regression with vce(robust) in Stata?

    • @mronkko
      @mronkko 17 hours ago

      Robust regression is a technique that automatically drops or down-weights outliers. The vce option defines how standard errors are calculated, and vce(robust) calculates heteroskedasticity-robust standard errors. I personally never use robust regression because I think that I am better at handling outliers than the computer. Outliers can also result from a non-linear functional form, in which case regression plus diagnostic plots will lead to better models than any automated approach. See my video on outliers. Here is an explanation by Claude:

      "To explain the difference between Robust Regression (rreg) and vce(robust) regression in Stata, let me break it down for you:

      1. Robust Regression (rreg): Robust regression, implemented with the 'rreg' command in Stata, is a method designed to be less sensitive to outliers compared to ordinary least squares (OLS) regression. It uses an iterative process to down-weight observations that have large residuals. The key points are:
      - It's an alternative estimation method, not just a way to calculate standard errors.
      - It's particularly useful when your data contains influential outliers.
      - The coefficients produced by rreg can be quite different from OLS, especially if there are significant outliers.
      - It uses Huber weights and biweights in its iterations to reduce the influence of outliers.

      2. vce(robust) Regression: The 'vce(robust)' option in Stata, which can be added to various regression commands like 'regress', doesn't change the coefficient estimates. Instead, it changes how the standard errors are calculated. Key points:
      - It uses the Huber-White sandwich estimators for variance.
      - The coefficient estimates remain the same as in OLS.
      - It's robust to heteroskedasticity (unequal variance of errors across observations).
      - It doesn't specifically address the issue of outliers in the way that rreg does.

      Main differences:
      1. Estimation method: 'rreg' changes how coefficients are estimated, while 'vce(robust)' only changes standard error calculations.
      2. Outlier handling: 'rreg' is specifically designed to handle outliers by down-weighting them, while 'vce(robust)' doesn't address outliers directly.
      3. Heteroskedasticity: 'vce(robust)' is primarily used to address heteroskedasticity, while 'rreg' is more focused on outlier resistance.
      4. Interpretation: Coefficients from 'rreg' might be interpreted differently due to the weighting process, while 'vce(robust)' allows standard OLS interpretation of coefficients.

      In practice, the choice between these methods depends on your specific data issues and research questions. If outliers are a major concern, 'rreg' might be more appropriate. If you're mainly worried about heteroskedasticity and want to keep your OLS estimates, 'vce(robust)' could be the better choice. Would you like me to elaborate on any specific aspect of these methods?"
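The distinction above can be sketched in plain Python for a one-predictor regression. This is an illustration under stated assumptions, not Stata's implementation: the data are made up, the sandwich standard error is the uncorrected HC0 form (Stata's vce(robust) also applies a small-sample correction), and `huber_line` is a hypothetical helper that uses only Huber weights, whereas rreg also screens observations and follows up with biweight iterations.

```python
import math
from statistics import median


def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def slope_standard_errors(x, y, a, b):
    """Classical and heteroskedasticity-robust (HC0) SEs of the OLS slope."""
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    classical = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)
    robust = math.sqrt(sum(((xi - xbar) * e) ** 2 for xi, e in zip(x, resid))) / sxx
    return classical, robust


def huber_line(x, y, k=1.345, iterations=50):
    """Iteratively reweighted least squares with Huber weights (a sketch)."""
    w = [1.0] * len(x)
    for _ in range(iterations):
        a, b = wls_line(x, y, w)
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        center = median(resid)
        scale = median([abs(e - center) for e in resid]) / 0.6745  # MAD-based scale
        if scale == 0:
            break
        # Observations with large residuals get weights below 1.
        w = [1.0 if abs(e) <= k * scale else k * scale / abs(e) for e in resid]
    return a, b


# Made-up data: true slope 0.5, with one gross outlier in the last observation.
x = list(range(20))
y = [2 + 0.5 * xi + 0.3 * math.sin(xi) for xi in x]
y[-1] += 30

ols_intercept, ols_slope = wls_line(x, y, [1.0] * len(x))
classical_se, robust_se = slope_standard_errors(x, y, ols_intercept, ols_slope)
huber_intercept, huber_slope = huber_line(x, y)

print(f"OLS slope: {ols_slope:.3f} (classical SE {classical_se:.3f}, robust SE {robust_se:.3f})")
print(f"Huber slope: {huber_slope:.3f}")
```

The robust standard error is computed from the same OLS fit, so the slope it accompanies is unchanged, while the Huber fit produces a different slope because the outlier is down-weighted.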

  • @George70220
    @George70220 1 day ago

    Omg your entire section is missing 💔 my condolences. Amazing and super useful speeches though!

    • @mronkko
      @mronkko 1 day ago

      There is a 50-minute version of that talk on the channel under the name "Normality assumption" ;)

  • @kurtpiron838
    @kurtpiron838 3 days ago

    Thanks for another great video. When creating a new scale, why do many researchers choose Principal Axis Factoring over Maximum Likelihood Factor Analysis, and is it a misconception that PAF handles violations of normality better?

    • @mronkko
      @mronkko 3 days ago

      I am not aware of any research that would show that to be the case. My experience is that PAF gives you a solution more often than ML when the factor model fit is really bad, but I have not seen a connection to normality.

  • @AzharulIslam-mi8eu
    @AzharulIslam-mi8eu 3 days ago

    Great, it was so helpful, thanks for this contribution 🤩

    • @mronkko
      @mronkko 1 day ago

      Glad it was helpful!

  • @tl2uz
    @tl2uz 4 days ago

    I would like to invite you to Korea wow

    • @mronkko
      @mronkko 1 day ago

      Please do so. I haven't taught courses outside Finland since the pandemic, but I did a few before it.

  • @seeunkim4185
    @seeunkim4185 11 days ago

    Thank you so much for your clarification. Your explanation is helping me organize the concepts again!!

    • @mronkko
      @mronkko 11 days ago

      You are welcome!

  • @George70220
    @George70220 12 days ago

    You're my favorite! Thanks again. After watching this, I am beyond embarrassed by my paper. I'm pretty sure half of it is trying to deal with how non-normal my data was.

    • @mronkko
      @mronkko 12 days ago

      Quite a few postdocs who have taken my methods course after their doctorate have felt the same ;)

    • @George70220
      @George70220 11 days ago

      @mronkko Are we supposed to go with the log-link if it works for more than 50% of variables? If we have 7 variables and 4 need to be logged, should we then use the log-link? Do we just log-transform the values if only 3/7 variables need it? I read your paper but don't see a multivariate situation like this.

    • @mronkko
      @mronkko 11 days ago

      @George70220 So you are saying that it looks like y is increasing linearly with x1 but exponentially with x2?

    • @George70220
      @George70220 10 days ago

      @mronkko Yes

    • @mronkko
      @mronkko 1 day ago

      Sorry for the delay. If x1 has a linear effect and x2 has an exponential effect, you need some kind of interaction of the two in the model. I talk about these kinds of models on page 67 of journals.sagepub.com/doi/10.1177/1094428121991907. But as a practical matter, unless you have a good theoretical reason to combine a linear and an exponential effect, I would probably go for a linear model for both effects.

  • @justintondt5245
    @justintondt5245 13 days ago

    Another misconception is that the normality assumption for a t-test is about the sample at all. The t-distribution is based on a normal sampling distribution (note sampling rather than sample). A sampling distribution is normal for either a normal population (which the sample may reflect with minimal sampling variability) or a large sample size.

    • @mronkko
      @mronkko 13 days ago

      Right. A sample distribution is what is observed and cannot conform to any probability distribution. I could have mentioned this in the talk, but I cut a few corners to make it as simple as possible.

    • @justintondt5245
      @justintondt5245 11 days ago

      @mronkko I think people often mix up the following two questions: 1. Is the population mean an appropriate choice for the analysis? 2. Given that we have chosen the population mean, can we reliably estimate it? Thank you for helping to disentangle these.

    • @mronkko
      @mronkko 11 days ago

      @justintondt5245 Right. I would further add 3) How can we test if the mean is different from another mean or a fixed value?

    • @justintondt5245
      @justintondt5245 11 days ago

      @mronkko Yes, good point
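The sample-versus-sampling distinction discussed in this thread can be illustrated with a small simulation. This is a sketch under assumptions chosen for illustration: an exponential population, a sample size of 100, 2,000 replications, and skewness as a rough indicator of non-normality. The raw observations stay strongly skewed, while the means of repeated samples are close to symmetric.

```python
import random
from statistics import mean


def skewness(values):
    """Moment-based sample skewness (0 for a symmetric distribution)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# One large sample of raw observations from a skewed (exponential) population.
raw_sample = [rng.expovariate(1.0) for _ in range(5000)]

# The sampling distribution of the mean: means of many samples of size 100.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(100)])
    for _ in range(2000)
]

raw_skew = skewness(raw_sample)
means_skew = skewness(sample_means)
print(f"skewness of raw observations: {raw_skew:.2f}")
print(f"skewness of sample means:     {means_skew:.2f}")
```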

  • @nl7247
    @nl7247 13 days ago

    Thank you for your comprehensive explanations and tips.❤

    • @mronkko
      @mronkko 12 days ago

      You are welcome!

  • @moazelsayed541
    @moazelsayed541 13 days ago

    Thank you, professor, for this enlightening video!!! I always thought I should test for normality before conducting regression, and reviewers even commented on my regression once for using a non-normal variable, although my sample size was 900+.

    • @mronkko
      @mronkko 13 days ago

      You are welcome. I guess this video met a demand ;)

  • @alinik4737
    @alinik4737 13 days ago

    A quick question: given this explanation, winsorization should also be unnecessary in regression analysis, right?

    • @mronkko
      @mronkko 13 days ago

      Short answer: Yes. Long answer: I am not a big fan of winsorization for two reasons: 1) I have never seen it discussed in a really good methods book and I do not know the origin of the idea. (I have not looked very hard, though.) 2) The procedure may produce erroneous outcomes. Let's assume that we do a survey of people's heights. A short 150 cm person accidentally types their height as 1500 centimeters. Winsorization would make the person e.g. 205 cm tall in the data. It is still incorrect. If you have outliers, rather than winsorizing the data, check if the outliers are potential data errors and fix them where possible. Otherwise, consider deleting observations if they are truly non-representative.
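The height example above can be reproduced with a few lines of Python. This is a minimal sketch: the list of heights is made up, and `winsorize` is a hypothetical helper that replaces the `m` lowest and `m` highest values with the nearest remaining value, which is one simple way to define winsorization.

```python
def winsorize(values, m):
    """Replace the m lowest and m highest values with the nearest remaining value."""
    ordered = sorted(values)
    low, high = ordered[m], ordered[-m - 1]
    return [min(max(v, low), high) for v in values]


# Made-up survey data in centimeters. The last respondent is really 150 cm
# tall but typed 1500 by mistake.
true_height_of_last_person = 150
heights = [158, 162, 165, 168, 170, 171, 173, 175, 178, 180, 183, 186, 190, 205, 1500]

winsorized = winsorize(heights, m=1)

print("recorded value:  ", heights[-1])
print("after winsorizing:", winsorized[-1])
print("true value:      ", true_height_of_last_person)
```

The typo is pulled in to the second-largest value in the data, so the shortest respondent ends up recorded as one of the tallest. Correcting the entry to 150, or dropping it if the true value cannot be recovered, avoids that.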

  • @alinik4737
    @alinik4737 13 days ago

    Great topic. Thank you Mikko!

    • @mronkko
      @mronkko 13 days ago

      You are welcome.

  • @GeoLi2
    @GeoLi2 16 days ago

    Wut

    • @mronkko
      @mronkko 16 days ago

      I was asked to do this topic in Finnish.

  • @goncamert7375
    @goncamert7375 17 days ago

    Thank you very much for the video. It is quite helpful. However, I want to ask you whether mediation analysis works when we have multiple mediators (2 sequential count variables) and two X variables (both are factors; one has 19 classes and the other has 33). How can we analyse this kind of problem? I would appreciate any comments. Thanks in advance.

    • @mronkko
      @mronkko 16 days ago

      Yes, you can use mediation analysis in that case. If your X variables are unordered categories, I would recommend modeling X-M relationships with linear regression. Do you mean your model is X-->M1->M2->Y, or do the M1 and M2 work in parallel representing two different causal mechanisms? If you can justify linear functional forms between the variables, specifying a mediation model is simple. (note that count variables can be modeled with linear regression contrary to common belief, see journals.sagepub.com/doi/full/10.1177/1094428121991907).

    • @goncamert7375
      @goncamert7375 16 days ago

      @mronkko Thank you very much for your reply. I will try modelling a linear relationship between X and M. My model is X-->M1->M2->Y. Thanks for the references indeed!

    • @mronkko
      @mronkko 16 days ago

      @goncamert7375 If you can model a linear relationship between M1 and M2, then things are simple. If not, the approach that I explain should still work easily because of the linear X->M1 relationship.

  • @BrusselsMochi
    @BrusselsMochi 24 days ago

    Thank you for sharing knowledge~

    • @mronkko
      @mronkko 23 days ago

      You are welcome!

  • @dipmalyaroy987
    @dipmalyaroy987 26 days ago

    Can you please share your slides for asrm 9?

    • @mronkko
      @mronkko 24 days ago

      Do you mean all missing data slides? I can easily provide you slides for a specific video and my goal is to link the slides to every video. However, I do not have a single set that would contain them all in one.

    • @mronkko
      @mronkko 23 days ago

      @dipmalyaroy987 nextcloud.jyu.fi/index.php/s/qXECY3a5GiJKM5B It contains everything that I have on the channel and more. Ultimately, I will have slides up at osf.io/c8qt6/ and linked to the video descriptions, but I have not had time to do it for all the videos yet.

  • @alonsowaltersenenromancauti
    @alonsowaltersenenromancauti 1 month ago

    Excellent video. Just as a side note: Arellano is a Spanish name, so it's pronounced like "Areyano". Anyway, good information; it was useful for me.

    • @mronkko
      @mronkko 1 month ago

      Thanks. I have tried to learn how to pronounce Poisson (French name) correctly. Now I need to add Arellano to the list. In my defence, most non-Finns pronounce my last name incorrectly ;)

  • @mikorees5853
    @mikorees5853 1 month ago

    Thanks! The rule of thumb is really helpful.

    • @mronkko
      @mronkko 1 month ago

      You are welcome!

  • @soumyadwipdas5495
    @soumyadwipdas5495 1 month ago

    Sir, I am trying to estimate the technical efficiency of farmers through stochastic frontier analysis using Stata. But whenever I try to run this model, the result shows “Iteration 14203: log-likelihood = 57.874872 (not concave)” and “convergence not achieved r(430)”. Sir, please tell me how I can solve this problem…

    • @mronkko
      @mronkko 1 month ago

      Model non-convergence can occur for many different reasons and it is impossible to say without having access to the model and the data. I recommend going through this playlist to see if you can figure out the problem ruclips.net/video/4_ip9F8EpVs/видео.html&pp=gAQBiAQB

  • @olllemand23
    @olllemand23 1 month ago

    Thanks for a great video. Can you recommend any papers that use the empirical test you mention at 12:28 for testing a possible violation of the parallel trends assumption?

    • @mronkko
      @mronkko 1 month ago

      I cannot come up with any specific examples. However, if you search for "parallel trends" "Difference-in-differences" in Google Scholar, you should find lots of examples.

  • @sophie20324940
    @sophie20324940 1 month ago

    Great explanation and plot. Only a minor recommendation: can you change the color of the independent variables? The current white color makes it very hard to see.

    • @mronkko
      @mronkko 1 month ago

      Good point. Unfortunately RUclips does not allow editing published videos, but I will adjust this on my slideset so that colors will be better if I re-record this ever.

  • @user-ex3vb4it8e
    @user-ex3vb4it8e 2 months ago

    Love U my teacher

    • @mronkko
      @mronkko 2 months ago

      Happy to hear that!

  • @Babette_Marty576
    @Babette_Marty576 2 months ago

    Hi Mikko. Thank you so much for posting those videos. Knowledge is so valuable in this day and age, and for you to post it for free is very generous! I myself have little statistics knowledge and was hoping you could help me with some questions that I have: I want to investigate whether changes in one variable between two time points (dependent) can be explained by changes in another variable between two time points (independent). I have repeated measures for each participant. A Hausman test with a p-value of 0.8 indicates that a random effects model is the better choice. My questions are: 1. I have a lot of observed variables that are highly correlated with my independent variable. Do I need to include those in the model? Am I otherwise in violation of the assumption that the unobserved heterogeneity is uncorrelated with my regressor? 2. How do I interpret the result correctly? If the intercept is 0.5 and the regression coefficient is 4, is it like in linear regression: for every one-unit increase in the independent variable, the dependent variable changes by the coefficient? How do I interpret the intercept? Thank you so much in advance!

    • @mronkko
      @mronkko 2 months ago

      You are welcome. 1) If your research design indicates that the variables should be included, then they need to be included. If they are highly correlated with the IV of interest, this becomes even more important. Otherwise you may be in violation of the assumption of an uncorrelated error term. 2) Yes, linear multilevel models are interpreted like regression. The intercept is typically not interpreted.

  • @sven8880
    @sven8880 2 months ago

    Thank you so much for your videos. I am finding them extremely helpful while studying for an exam in “Quantitative interpretation of tests” in Psychology. Greetings from Zagreb, Croatia!

    • @mronkko
      @mronkko 2 months ago

      You are welcome!

  • @vitorfernandes2406
    @vitorfernandes2406 2 months ago

    You always have very good videos, but this one was confusing.

    • @mronkko
      @mronkko 2 months ago

      Thanks for the feedback. Can you be more specific: what exactly did you find confusing about it? I will run a course next academic year where I will use this video, and I might re-record it if I know what the pain points are.

  • @BrinderSadler
    @BrinderSadler 2 months ago

    A very informative video that is clear and uses examples so that viewers can better follow. Thank you.

    • @mronkko
      @mronkko 2 months ago

      You are welcome!

  • @BrinderSadler
    @BrinderSadler 2 months ago

    I am thrilled to have found this channel. I haven't studied econometrics since my first degree and now I'm trying to model a dynamic panel data model and your videos are helping my confidence - thank you! I would appreciate more videos on Stata coding for xtdpdml, too.

    • @mronkko
      @mronkko 2 months ago

      You are welcome. I do not do many software demos because I teach multiple software packages and because, particularly with Stata, the documentation explains the software well. Understanding the conceptual side is more challenging, and once you know that, specifying the command is not very challenging.

  • @sarahsinkerton8479
    @sarahsinkerton8479 2 months ago

    🎉

  • @kevon217
    @kevon217 2 months ago

    Gotta say, I’m loving these overviews. Jam packed with useful nuggets. Appreciate it!

    • @mronkko
      @mronkko 2 months ago

      You are welcome!

  • @kevon217
    @kevon217 2 months ago

    Excellent explanations! Appreciate this!

    • @mronkko
      @mronkko 2 months ago

      You are welcome!

  • @giovannamagri6489
    @giovannamagri6489 2 months ago

    I simply love you, you saved my life and my grades. THANK YOU

    • @mronkko
      @mronkko 2 months ago

      You are welcome.

  • @m.rafilaw6772
    @m.rafilaw6772 3 months ago

    Thank you, the explanation of each part of Multilevel models is very clear!

    • @mronkko
      @mronkko 1 month ago

      Glad it was helpful!

  • @rachaelcharlesvoiceovertal7589
    @rachaelcharlesvoiceovertal7589 3 months ago

    great vid but it doesn't show me How To ...

    • @mronkko
      @mronkko 3 months ago

      The "How to" part is software specific and typically you just pick a rotation method from a dropdown menu.

  • @aayushilyngwa7235
    @aayushilyngwa7235 3 months ago

    Hello. Thank you very much for such a detailed description of the Gioia method. Just a query: do we need any software to use this method?

    • @mronkko
      @mronkko 3 months ago

      You can do it with paper and pencil. I have colleagues who use Excel for coding, and others use specialized software, e.g. Atlas.ti.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 3 months ago

    this is really well explained and organized. i think i am finally getting it.

    • @mronkko
      @mronkko 3 months ago

      You are welcome!

  • @xuyang2776
    @xuyang2776 3 months ago

    Hello, Mikko. I have a question about SEM. In recent years, an increasing number of people have been using time series to build Structural Equation Models (SEMs). I wonder if it is appropriate to estimate parameters using the Maximum Likelihood method. I think there is an obstacle, since classic SEM theory requires that different observations are independent. For example, if I survey N people and ask each person for their math scores and English scores, while these scores may be correlated with each other, different individuals’ scores should be independent. This assumption, along with the assumption of normal distribution, ensures that the Maximum Likelihood estimator is consistent, asymptotically unbiased, and asymptotically efficient. It also allows the application of Z-tests and chi-square tests. However, time series data do not meet this assumption. For instance, consider two variables: GDP(t) and Investment(t). We can consider (GDP(2020), Investment(2020)) and (GDP(2021), Investment(2021)) as two observations (or two participants’ scores). These two observations are obviously correlated. Therefore, in such a case, if Maximum Likelihood estimation is adopted, will the estimators still possess these desirable properties? If not, is there any other method that could be used? Thank you very much.

    • @mronkko
      @mronkko 3 months ago

      It depends on how SEM is used. In psychology, where panels are short, data are often analyzed in wide format where each repeated observation is its own variable. See e.g. doi.org/10.1177/1094428117715509 In this case there is just one observation per individual, and non-independence due to repeated observations is not an issue. If you model your data in long format, where each repeated observation is a new case in your data, then non-independence is a problem if any of the latent variables (including error terms) correlate within individual. If this is the case, then ML is still consistent but test statistics can be misleading. But this can be solved by using cluster-robust standard errors and test statistics.

    • @xuyang2776
      @xuyang2776 1 month ago

      @mronkko Thank you very much. Your last sentence is very impressive. That is, "ML is still consistent but test statistics can be misleading. But this can be solved by using cluster robust standard errors and tests statistics". I want to know how to prove the consistency in this case. Or could you point me to a relevant reference? Additionally, which SEM software provides cluster-robust standard errors? Thanks again.

    • @mronkko
      @mronkko 1 month ago

      @xuyang2776 I believe you can find the proof of consistency of ML in e.g. the Lehmann and Casella book on statistical estimation, but I do not think knowing the proof is useful for an applied researcher. Lavaan, Stata, and Mplus, which are the software that I use, all provide cluster-robust standard errors.

    • @xuyang2776
      @xuyang2776 1 month ago

      @mronkko Thanks again. What I really want to know is the proof of consistency even in the case where all the variables in a SEM are autocorrelated. Although I know that in a regression model the OLS estimator is still unbiased and consistent when the random terms are autocorrelated, SEM is not the same. Additionally, if ML plus cluster-robust standard errors could resolve the dependency problem in a SEM, why were dynamic factor analysis and dynamic SEM developed?

    • @mronkko
      @mronkko 1 month ago

      @xuyang2776 Dynamic factor analysis is designed for situations where you are interested in estimating effects of latent variables over time. Cluster-robust SEs work in situations where you are not interested in such dynamics, but want to estimate a cross-sectional model with data that are autocorrelated.
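The point about cluster-robust standard errors in this thread can be sketched with a simulation. Several things here are assumptions made for illustration: a simple one-predictor regression stands in for the SEM, the data are simulated with a shared error component within each cluster, and the cluster-robust formula is the basic sandwich form without small-sample corrections (software such as Stata applies additional adjustments). The slope estimate is the same in both cases; only the standard error changes.

```python
import math
import random

rng = random.Random(2024)  # fixed seed so the sketch is reproducible
n_clusters, cluster_size = 50, 10

# Simulated long-format data: each cluster (e.g. an individual) contributes
# several observations that share an error component u_g.
x, y, cluster = [], [], []
for g in range(n_clusters):
    x_g = rng.gauss(0, 1)  # predictor varies only between clusters
    u_g = rng.gauss(0, 1)  # shared error -> within-cluster correlation
    for _ in range(cluster_size):
        x.append(x_g)
        y.append(1.0 + 0.5 * x_g + u_g + rng.gauss(0, 1))
        cluster.append(g)

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
intercept = ybar - slope * xbar
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

# Conventional SE: treats all observations as independent.
conventional_se = math.sqrt(sum(e * e for e in resid) / (n - 2) / sxx)

# Cluster-robust SE: sum the score contributions within each cluster first,
# then square, so that within-cluster correlation is taken into account.
scores = {}
for xi, e, g in zip(x, resid, cluster):
    scores[g] = scores.get(g, 0.0) + (xi - xbar) * e
cluster_se = math.sqrt(sum(s * s for s in scores.values())) / sxx

print(f"slope: {slope:.3f}")
print(f"conventional SE:   {conventional_se:.3f}")
print(f"cluster-robust SE: {cluster_se:.3f}")
```

With this setup the conventional standard error understates the uncertainty, because it counts correlated observations within a cluster as if they were independent pieces of information.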

  • @ritbanoahmed1769
    @ritbanoahmed1769 3 months ago

    Subject: Request for Data Set and Guidance on Random Effects Model for Longitudinal Data Dear Mikko Rakko, I hope this message finds you well. My name is [Your Name], and I am currently a PhD student in a nursing program. I am reaching out to kindly request if you have a data set available that I could use to check assumptions on the random effects model for longitudinal data. I have been following your RUclips videos and they have been incredibly helpful in enhancing my understanding of various statistical concepts, including the assumptions regarding Random Effects Models (REMs) and others. Your guidance has been invaluable to me as I delve deeper into my research. I would greatly appreciate any assistance or advice you could provide in this regard. Your expertise in this field would be immensely beneficial as I continue my studies and research in nursing. Thank you for considering my request. I look forward to hearing from you soon. Warm regards, Ritbano Ahmed

    • @mronkko
      @mronkko 3 months ago

      This seems like an AI generated post given the placeholders [Your Name]. I think the best way to produce example datasets is to simulate them yourself. The second best would be to look at work by Paul Bliese. He has published a lot on multilevel models including example datasets.

  • @Youtuube304s
    @Youtuube304s 3 months ago

    Subscribed. Very good

    • @mronkko
      @mronkko 3 months ago

      You are welcome.

  • @will74lsn
    @will74lsn 3 months ago

    Can I find examples somewhere of random coefficient models where the variable with the random coefficient is categorical rather than continuous? Ideally written in Stata or SPSS?

    • @mronkko
      @mronkko 3 months ago

      A random effect is a latent variable. A categorical latent variable is often referred to as a latent class. What you are looking for is a latent class model where the latent groups are allowed to differ on their path coefficients. Stata does latent classes; see www.stata.com/features/overview/latent-class-analysis/ SPSS does not.

  • @nokulungambona33
    @nokulungambona33 3 months ago

    Hi. What do you think about stationarity when dealing with GMM? Does it matter if you use system GMM even if the data are not stationary?

    • @mronkko
      @mronkko 3 months ago

      System GMM requires mean stationarity. See doi.org/10.1080/00036846.2018.1540854

  • @heinzbongwasser2715
    @heinzbongwasser2715 4 months ago

    thanks bro

    • @mronkko
      @mronkko 4 months ago

      You are welcome

  • @benjecklin7806
    @benjecklin7806 4 months ago

    Smooth! Understandable, solid.

    • @mronkko
      @mronkko 4 months ago

      You are welcome!

  • @hkccp
    @hkccp 4 months ago

    fking cool, man

    • @mronkko
      @mronkko 4 months ago

      You are welcome!

  • @tranang5746
    @tranang5746 4 months ago

    I have the following question, and the answer is c). Can you please help explain why? The Spanish Army used to have a 2-year forced-conscription military service at 21 years of age. Because each year there were more conscripts than were needed, recruits could apply for military service exemption. Not surprisingly, there used to be more applications for exemptions than exemptions available, and the army used to hold a lottery to randomly choose which applicants were awarded the exemption (and which applicants were forced to serve). A comparison of the earnings of exempted applicants and unsuccessful applicants some years after the service would allow us to evaluate: a) The average earnings effect of serving in the army. b) The average earnings effect of serving in the army for those who served. c) The average earnings effect of serving in the army for those who did not serve.

    • @mronkko
      @mronkko 4 months ago

      This is a really good question. You need to think about what the two groups are. You have everyone who did not serve, but only a subset of those who did. Therefore you cannot generalize to the full population or to those who did serve. Here is a GPT-4 generated explanation: The correct answer is c) "The average earnings effect of serving in the army for those who did not serve." This might initially seem counterintuitive, but it's based on a fundamental concept in econometrics known as the "Local Average Treatment Effect" (LATE). In the scenario described, there is a random assignment of military service exemptions through a lottery. This randomization ensures that, on average, the groups of exempted (treatment group) and non-exempted (control group) applicants are similar in all respects except for the treatment, here serving in the military. This similarity is crucial because it mimics the conditions of a randomized controlled trial, allowing us to infer causality from the comparison between the two groups. Now, let's break down why each answer choice is what it is:
      a) "The average earnings effect of serving in the army" suggests we are looking at the impact on all individuals, regardless of their inclination or circumstances that led them to apply for the exemption. This is not what the comparison would reveal since the analysis only includes individuals who applied for the exemption and were subjected to the lottery system.
      b) "The average earnings effect of serving in the army for those who served" might seem like a reasonable answer, but it's not the focus of this comparison. This is because the analysis isn't solely focused on those who served; it also includes those who were exempted. The aim is to understand the impact of not serving (being exempted) versus serving.
      c) "The average earnings effect of serving in the army for those who did not serve" is correct because the analysis effectively compares individuals who wanted to be exempted (and thus, by extension, did not want to serve). Those who win the lottery (and are exempted) serve as the treatment group, and those who lose (and thus serve) are the control group. The comparison then reveals the effect of not serving on the group that applied for exemptions but had to serve due to losing the lottery. In essence, this setup allows us to estimate the impact of military service on those who would have preferred not to serve but were compelled to do so because they did not win the exemption lottery. It's a subtle but important distinction in understanding the causal effects in this scenario.
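
      A small simulation may make the point about the two groups concrete. This is only a sketch under assumed numbers: the application rate, the earnings distribution, and the effect sizes are all hypothetical. It also assumes that non-applicants and unsuccessful applicants always serve, so that everyone who did not serve is an exempted applicant.

```python
import random
import statistics

rng = random.Random(42)

applicant_served, applicant_exempt = [], []  # observed earnings of applicants
effects_served, effects_not_served, effects_all = [], [], []  # true effects

for _ in range(100_000):
    applies = rng.random() < 0.4  # hypothetical share applying for exemption
    # Assumption: serving hurts applicants' earnings more than non-applicants'.
    effect = rng.gauss(-3.0, 0.5) if applies else rng.gauss(-1.0, 0.5)
    earnings_if_exempt = rng.gauss(30.0, 5.0)
    earnings_if_served = earnings_if_exempt + effect
    exempt = applies and rng.random() < 0.5  # lottery among applicants only

    effects_all.append(effect)
    if exempt:
        applicant_exempt.append(earnings_if_exempt)
        effects_not_served.append(effect)
    else:
        effects_served.append(effect)
        if applies:
            applicant_served.append(earnings_if_served)

# What the lottery comparison estimates from observed earnings alone.
lottery_estimate = (statistics.fmean(applicant_served)
                    - statistics.fmean(applicant_exempt))

# True average effects, known only because this is a simulation.
effect_not_served = statistics.fmean(effects_not_served)
effect_served = statistics.fmean(effects_served)
effect_everyone = statistics.fmean(effects_all)

print("lottery comparison:        ", round(lottery_estimate, 2))
print("effect for those not served:", round(effect_not_served, 2))
print("effect for those who served:", round(effect_served, 2))
print("effect for everyone:        ", round(effect_everyone, 2))
```

      By construction, the lottery comparison lands close to the average effect for those who did not serve, while the averages for those who served and for the whole population differ because they include non-applicants.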

  • @pedrocolangelo5844
    @pedrocolangelo5844 4 months ago

    Sir, thank you so much for this video. You are amazing. I'd like to ask a trivial question, if you (or anyone who sees this) could answer me: when differencing, why doesn't beta_0 get canceled as well? I mean, when we subtract y_t-1 on both sides, wouldn't that produce a negative beta_0 that cancels the positive beta_0 on the right-hand side? Thank you so much!

    • @mronkko
      @mronkko 4 months ago

      First differencing does eliminate beta_0, but there is an error in my slides. My video on first differencing has the correct equation ruclips.net/video/hQWSh_j3Oy0/видео.html
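
      As a sketch of why the intercept drops out, assume a generic linear model with a time-constant intercept. The symbols x_t and u_t are illustrative, and the model on the slides may contain further terms. Writing the model for periods t and t-1 and subtracting the second equation from the first gives:

```latex
\begin{align}
y_{t}   &= \beta_0 + \beta_1 x_{t}   + u_{t} \\
y_{t-1} &= \beta_0 + \beta_1 x_{t-1} + u_{t-1} \\
y_{t} - y_{t-1} &= (\beta_0 - \beta_0) + \beta_1 (x_{t} - x_{t-1}) + (u_{t} - u_{t-1}) \\
\Delta y_{t} &= \beta_1 \Delta x_{t} + \Delta u_{t}
\end{align}
```

      The negative beta_0 appears because the model expression for y_t-1 is substituted on the right-hand side, which is the cancellation described in the question.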

  • @rubyanneolbinado95
    @rubyanneolbinado95 4 months ago

    Hi, why is RStudio producing different results when using the same call?

    • @mronkko
      @mronkko 4 months ago

      I do not understand the question. What call are you referring to?

  • @junbeombahk3668
    @junbeombahk3668 4 months ago

    Hi! Why would we not use a t test in this case?

    • @mronkko
      @mronkko 4 months ago

      You could use a t test for testing a single constraint. But in practice we use F because it can handle both a single constraint and multiple constraints. An F distribution with one numerator degree of freedom is simply the square of the corresponding t distribution.
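
      The relationship between the two tests can be checked numerically. The sketch below uses simulated data with made-up parameter values and a simple regression with one predictor. It computes the t statistic for the slope and the F statistic for the single constraint "slope = 0" from the restricted and unrestricted residual sums of squares.

```python
import math
import random

rng = random.Random(7)
n = 50
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 + 0.3 * xi + rng.gauss(0.0, 1.0) for xi in x]  # hypothetical model

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# OLS estimates for the unrestricted model y = intercept + slope * x + error.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual sums of squares: unrestricted model vs. restricted model (slope = 0).
rss_unrestricted = sum((yi - intercept - slope * xi) ** 2
                       for xi, yi in zip(x, y))
rss_restricted = sum((yi - mean_y) ** 2 for yi in y)

df_resid = n - 2
sigma2 = rss_unrestricted / df_resid

# t statistic for the slope and F statistic for the one constraint slope = 0.
t_stat = slope / math.sqrt(sigma2 / sxx)
f_stat = ((rss_restricted - rss_unrestricted) / 1) / (rss_unrestricted / df_resid)

print(t_stat ** 2, f_stat)  # the two values agree up to rounding error
```

      With more than one constraint there is no single t statistic to square, which is why the F test is the more general tool.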

    • @junbeombahk3668
      @junbeombahk3668 4 months ago

      @@mronkko Thanks!! Then why should we in some cases use a Wald test instead of a t test, as in this video?

    • @mronkko
      @mronkko 4 months ago

      @@junbeombahk3668 I am not using a t test in the video. The test is a single-parameter Wald test. See this video for an explanation of the tests: ruclips.net/video/AbhwpFX2Xdw/видео.html

  • @alvarorodriguezrojas2615
    @alvarorodriguezrojas2615 4 months ago

    By “characteristic”, do you mean “the effects of treatment”? Thanks in advance!

    • @mronkko
      @mronkko 4 months ago

      No, I meant that we measure the characteristic of each individual after the treatment. We cannot measure the effects of treatment. We can only measure the characteristics of interest and then use these measured characteristics to calculate an estimate for the average treatment effect.