New Zealand Statistical Association 2024 Conference


Thomas Yee

University of Auckland

Heaping and seeping, GAITD regression and doubly constrained reduced rank vector generalized linear models, in smoking studies


This is joint work with Luca Frigau and Chenchen Ma.

Large-scale health surveys suitable for addiction studies furnish self-reported data that consequently suffer from a form of measurement error called heaping. Also known as digit preference, the aberration is often characterized by spikes at multiples of 10 or 5 arising from rounding. To date, methods and software for heaped and seeped data have been largely wanting. Identifying three generic problems for simple addiction studies, we solve them with a newly developed technique called Generally Altered, Inflated, Truncated and Deflated (GAITD) regression for counts, applied to the most recent NHANES data. In conjunction, we propose the class of Doubly constrained Reduced-Rank VGLMs (DRR-VGLMs) to allow dimension reduction and further simplification. We determine the distribution of smoking initiation age (SIA) and its association with tobacco consumption and smoking duration, e.g., is a lower SIA associated with higher tobacco consumption later in life? Is a higher SIA associated with shorter smoking duration among quitters? Together, GAITD regression and DRR-VGLMs hold promise for heaped and seeped data.

