New Zealand Statistical Association 2024 Conference
Thomas Yee
University of Auckland
Heaping and seeping, GAITD regression and doubly constrained reduced rank vector generalized linear models, in smoking studies
This is joint work with Luca Frigau and Chenchen Ma.
Large-scale health surveys suitable for addiction studies furnish
self-reported data that consequently suffer from a form of measurement
error called heaping. Also known as digit preference, the aberration
is often characterized by spikes at multiples of 10 or 5 upon
rounding. To date, methods and software for heaped and seeped data
have been largely wanting. Identifying three generic problems for
simple addiction studies, we solve them with a newly developed technique
called Generally Altered, Inflated, Truncated and Deflated (GAITD)
regression for counts, applied to the most recent NHANES data. In
conjunction, we propose the class of Doubly constrained Reduced-Rank
VGLMs (DRR-VGLMs) to allow further simplification of the dimension
reduction. We determine
the distribution of smoking initiation age (SIA) and its association
with tobacco consumption and smoking duration, e.g., is a lower SIA
associated with higher tobacco consumption later in life? Is higher
SIA associated with shorter smoking duration among quitters? Together,
GAITD regression and DRR-VGLMs hold promise for heaped and seeped data.