Supporting Statement B

SSB_Understanding Social Interactions and Sexual Behavior in the Military_20240724.docx

DoD-wide Data Collection and Analysis for the Department of Defense Qualitative and Quantitative Data Collection in Support of the Independent Review Commission on Sexual Assault Recommendations

OMB: 0704-0644


SUPPORTING STATEMENT – PART B

B.  COLLECTIONS OF INFORMATION EMPLOYING STATISTICAL METHODS

If the collection of information employs statistical methods, the following information should be provided in this Supporting Statement:

1.  Description of the Activity

Describe the potential respondent universe and any sampling or other method used to select respondents.  Data on the number of entities covered in the collection should be provided in tabular form for the universe as a whole and for each of the strata in the proposed sample.  Indicate the expected response rates for the collection as a whole, as well as the actual response rates achieved during the last collection, if previously conducted.

Male enlisted service members who are on active duty in any branch of military service, at the ranks of E1-E4, are eligible to participate in this research; we refer to this group as the sample frame. Participation is restricted to male service members on active duty to allow for a uniform set of questions tailored to this population. The enlisted population and age range (i.e., the ages corresponding to E1-E4 ranks) were identified, based on existing DoD and civilian data, as being at highest risk of sexually aggressive behavior. The sample frame contains roughly 428,000 individuals (see the table below). Because we expect low response rates (10-15%), reaching the target of 3,000 respondents will require inviting roughly 20,000 to 30,000 service members. A power analysis was used to estimate the total sample size (number of respondents = 3,000) required to maintain the desired error rates for the analysis. The power analysis accounts for the estimated response rate, estimated rates of sexually aggressive behavior, and other factors.

Strata                          Army      Navy     Marine Corps   Air Force / Space Force   Total
Sample Frame [1]                149,823   90,665   89,204         98,131                    427,823
Proportion                      35%       21%      21%            23%                       100%
Target Number of Respondents    1,051     636      626            688                       3,000
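The allocation in the table can be checked with a short calculation. The sketch below is illustrative only: it assumes the stratum targets are proportional to the frame counts and rounded to whole respondents, and the function names are hypothetical rather than part of the study's tooling. Under that assumption the rounded stratum targets sum to 3,001, one more than the stated total of 3,000.

```python
# Frame counts copied from the table (DMDC, June 2023).
FRAME = {
    "Army": 149_823,
    "Navy": 90_665,
    "Marine Corps": 89_204,
    "Air Force / Space Force": 98_131,
}
TARGET_RESPONDENTS = 3_000


def proportional_allocation(frame, target):
    """Allocate the respondent target in proportion to each stratum's frame count."""
    total = sum(frame.values())
    return {stratum: round(target * n / total) for stratum, n in frame.items()}


def invitations_needed(target, response_rate_pct):
    """Invitations required to reach `target` respondents at a given response rate.

    The rate is passed as a whole-number percentage so the ceiling division is exact.
    """
    return -(-target * 100 // response_rate_pct)


if __name__ == "__main__":
    allocation = proportional_allocation(FRAME, TARGET_RESPONDENTS)
    for stratum, n in allocation.items():
        print(f"{stratum}: {n}")
    print("Sum of rounded targets:", sum(allocation.values()))
    for pct in (10, 15):
        print(f"{pct}% response rate:", invitations_needed(TARGET_RESPONDENTS, pct), "invitations")
```

The invitation counts follow directly from the expected 10-15% response rate stated in the text; they are planning figures, not commitments.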



2.  Procedures for the Collection of Information

Describe any of the following if they are used in the collection of information:

a.  Statistical methodologies for stratification and sample selection;

To ensure a representative sample across services, we plan to stratify the sample frame by service with unequal stratum sample sizes, allocating a smaller sample size to the Space Force. Our sample frame contains only around 1,500 individuals in the Space Force; after sampling, we will group Space Force individuals into the Air Force stratum for analysis (see analysis plan). The goal of stratification is to ensure participation from each of the armed services, with the Space Force included with the Air Force. Optimal stratification to reduce the variance of the estimators is challenging because the sample frame is already restricted to a single gender and a narrow age range. Hence, we do not expect to stratify the sample frame further.

Sampling will be stratified random sampling without replacement and will proceed in waves. Each sampled individual will be recruited for the survey via an email address provided by DoD. To reach the target sample sizes, we must account for estimated participation rates while balancing the limited funding available to encourage participation; for this reason, we will sample in waves until the desired sample size is reached. The waves will solicit participation from mutually exclusive sets of service members so that no service member is sampled more than once. The size of each wave may decrease as sampling approaches its goal, but each wave will contain at least 100 service members to minimize the chance of identifying a respondent. Before sampling begins, each individual in the sample frame will be assigned a study-specific identifier so that sampling can proceed without linking email addresses to administrative records. Details of the sampling procedure are provided in the appendix.
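One way to picture the wave procedure is the sketch below. It is a minimal illustration, not the study's implementation: the function name, the toy identifiers, and the per-wave stratum sizes are hypothetical, and the 100-person floor is applied to the total wave size as described in the text.

```python
import random

MIN_WAVE_SIZE = 100  # floor on total wave size stated in the text


def draw_wave(remaining, wave_sizes, rng):
    """Draw one wave by simple random sampling without replacement per stratum.

    remaining  : dict stratum -> list of study IDs not yet sampled (updated in place)
    wave_sizes : dict stratum -> number of IDs requested for this wave
    rng        : random.Random instance
    """
    if sum(wave_sizes.values()) < MIN_WAVE_SIZE:
        raise ValueError("wave is smaller than the minimum wave size")
    wave = {}
    for stratum, n in wave_sizes.items():
        pool = remaining[stratum]
        chosen = rng.sample(pool, min(n, len(pool)))
        chosen_set = set(chosen)
        # Remove drawn IDs so later waves are mutually exclusive with this one.
        remaining[stratum] = [sid for sid in pool if sid not in chosen_set]
        wave[stratum] = chosen
    return wave


if __name__ == "__main__":
    rng = random.Random(2024)
    # Toy frame with hypothetical study IDs; real strata are far larger.
    remaining = {
        "Army": [f"A{i:04d}" for i in range(500)],
        "Navy": [f"N{i:04d}" for i in range(300)],
    }
    first = draw_wave(remaining, {"Army": 70, "Navy": 40}, rng)
    second = draw_wave(remaining, {"Army": 60, "Navy": 40}, rng)
    print(len(first["Army"]), len(second["Army"]), len(remaining["Army"]))
```

Because each wave removes its draws from the remaining pool, no identifier can appear in two waves.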

b.  Estimation procedures;

The goal of our analysis is to use the survey data to test for the existence of the relationships specified by our model. A secondary goal will involve exploratory analysis of additional relationships. The model applies Malamuth’s Confluence Model (Malamuth, 1993) to a military context and examines the relationship between sexual behavior and childhood trauma, past misconduct, attitudes about sex and gender, sexual harassment, and key contextual factors (i.e., workplace hostility, peer attitudes, alcohol use, social connectedness). We expect to encounter several analytical challenges that will require thoughtful planning of the statistical analysis.

Stratification weighting will be used to account for the probability of selection into the sample from each stratum. Because the Space Force and Air Force strata are merged for analysis, the selection probabilities for Space Force and Air Force service members will be combined to create a single aggregate weight for the merged stratum.
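As an illustration of the weighting described above, the sketch below computes a base weight for each analysis stratum as the number of frame members divided by the number sampled, pooling the Air Force and Space Force before dividing. This is one plausible reading of the aggregate weight, not a confirmed specification. The sampled counts are hypothetical, and the split of the combined 98,131 into Air Force and Space Force uses the approximate Space Force figure of 1,500 from the text.

```python
def merged_design_weights(frame_counts, sampled_counts, merge_map):
    """Base weight per analysis stratum: frame members divided by members sampled.

    Strata listed in `merge_map` are pooled into the named analysis stratum
    before the weight is computed.
    """
    frame_totals, sample_totals = {}, {}
    for stratum, count in frame_counts.items():
        key = merge_map.get(stratum, stratum)
        frame_totals[key] = frame_totals.get(key, 0) + count
        sample_totals[key] = sample_totals.get(key, 0) + sampled_counts[stratum]
    return {key: frame_totals[key] / sample_totals[key] for key in frame_totals}


if __name__ == "__main__":
    # Air Force / Space Force split is approximate (Space Force is "around 1,500").
    frame = {"Army": 149_823, "Navy": 90_665, "Marine Corps": 89_204,
             "Air Force": 96_631, "Space Force": 1_500}
    # Hypothetical numbers of service members sampled across all waves.
    sampled = {"Army": 10_000, "Navy": 6_000, "Marine Corps": 6_000,
               "Air Force": 6_000, "Space Force": 500}
    merge = {"Air Force": "Air Force / Space Force",
             "Space Force": "Air Force / Space Force"}
    for stratum, weight in merged_design_weights(frame, sampled, merge).items():
        print(f"{stratum}: {weight:.3f}")
```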

The survey combines several scales measuring the following: social networks, connectedness, alcohol use, sexual behavior, sexual harassment, past misconduct, gender-related attitudes, workplace hostility, peer norms, identification with the military, adverse childhood experiences, and socially desirable responding. We are also collecting broad demographic information to account for some group-level confounding variation; however, due to the sensitive nature of the survey and our commitment to maintaining the anonymity of our respondents, we are not collecting small-group information that could make any respondent identifiable. As a result, we will not be able to capture the extent to which small-group variation (such as variation within military units) explains responses. In some situations, omitting relevant heterogeneity from the model can bias estimates. We will caveat our analyses accordingly.

We will estimate the relationships in our model using linear structural equation modeling and logistic regression, as appropriate for the specification of the outcome variable. We believe that both the relationships of interest and non-response may be correlated with certain demographic characteristics, so we will collect and control for this information to the extent possible, taking into account how collecting it affects respondent anonymity. The information we ask for can be found in the attached survey. This information will be used directly in the analysis to control for confounding variation and also in the non-response bias analysis. We will document potential confounders that we are not able to control for due to the need to maintain respondent anonymity.

To account for multiple hypothesis testing, we will report Holm-corrected p-values alongside other metrics for comparison. The Holm method allows us to control the family-wise error rate while maintaining higher power to reject false null hypotheses than is attainable with a traditional Bonferroni correction.
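For readers unfamiliar with the Holm step-down procedure, the following minimal sketch shows how adjusted p-values are computed and how they compare with Bonferroni-adjusted values. The p-values in the example are made up for illustration and are not study results.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank k-1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


def bonferroni_adjust(p_values):
    """Return Bonferroni adjusted p-values for comparison."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.005]  # illustrative values only
    print("Holm:      ", [round(p, 4) for p in holm_adjust(example)])
    print("Bonferroni:", [round(p, 4) for p in bonferroni_adjust(example)])
```

Each Holm-adjusted value is no larger than its Bonferroni counterpart, which is the source of the power advantage noted above.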

c.  Degree of accuracy needed for the Purpose discussed in the justification;

A power analysis was used to estimate the total sample size (number of respondents = 3,000) required to maintain the desired error rates for the analysis. The power analysis accounts for the estimated response rate, estimated rates of sexually aggressive behavior, and other factors. The information available from previous related studies does not lend itself well to a power analysis, so we made some assumptions based on results from related studies to facilitate it. The sample size was determined such that statistical error rates were bounded at 1% type I error (99% confidence) and 10% to 20% type II error (80% to 90% power).
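The statement does not describe the form of the power analysis, so the sketch below should be read only as an illustration of how the stated error bounds translate into a sample size in a much simpler setting: a two-sided comparison of two proportions using the normal approximation. The prevalence values are hypothetical placeholders, not estimates from the study or from prior literature.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha, power):
    """Approximate sample size per group to detect p1 versus p2.

    Uses the normal approximation for a two-sided test of two proportions:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical rates chosen only to show the mechanics.
    for power in (0.80, 0.90):
        n = n_per_group(p1=0.10, p2=0.15, alpha=0.01, power=power)
        print(f"power={power:.2f}: about {n} respondents per group")
```

Tightening the type I error to 1% and raising power from 80% to 90% both increase the required sample size, which is consistent with the need for a relatively large respondent pool.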

d.  Unusual problems requiring specialized sampling procedures; and

The sensitive nature of the questions being asked in the survey may induce non-response and non-truthful responses in a way that is correlated with the items being measured. We have a variety of mechanisms in place to account for this (see Section 2.b and Section 3 of this document), and we will address potential shortcomings transparently.

e.  Use of periodic or cyclical data collections to reduce respondent burden.

This is a one-time survey effort. Periodic or cyclical data collection will not be used to reduce respondent burden.

3.  Maximization of Response Rates, Non-response, and Reliability

Discuss methods used to maximize response rates and to deal with instances of non-response.  Describe any techniques used to ensure the accuracy and reliability of responses is adequate for intended purposes.  Additionally, if the collection is based on sampling, ensure that the data can be generalized to the universe under study.  If not, provide special justification.

Maximizing response rates

To incentivize survey completion, service members will be informed that participants may claim a gift certificate for a nominal amount ($15-$35) at the end of the survey and that to do so, they must complete the survey outside of duty hours on a personal device. DoD policy specifies that if research incentive payments/gifts are provided, service members must participate in the research during non-duty hours. Participants will complete a separate survey to enter their email address to receive a gift certificate, but this cannot be linked to their survey responses. Additionally, participants will receive multiple reminder e-mails to encourage participation.

Addressing non-response

We expect to encounter selection into response resulting in both unit and item non-response. We collect information similar to that used in a DoD survey (the Workplace and Gender Relations Survey) to estimate response propensity and reduce bias due to non-response. We treat skipped responses to individual questions as a separate response category, and we plan to examine results with imputed item non-response in our robustness checks. [2]
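The three treatments of item non-response mentioned here and in the accompanying footnote (a separate "skipped" category, random imputation, and conditional modal imputation) can be sketched as follows. The conditioning variable, response labels, and function names are hypothetical; the actual robustness checks may differ once the data are collected.

```python
import random
from collections import Counter, defaultdict

SKIPPED = "skipped"


def as_separate_category(responses):
    """Main approach: keep skipped items as their own response category."""
    return [SKIPPED if r is None else r for r in responses]


def impute_random(responses, rng):
    """Robustness check: replace skips with a random draw from observed answers."""
    observed = [r for r in responses if r is not None]
    return [rng.choice(observed) if r is None else r for r in responses]


def impute_conditional_mode(responses, groups):
    """Robustness check: replace skips with the most common answer in the same group."""
    by_group = defaultdict(Counter)
    for r, g in zip(responses, groups):
        if r is not None:
            by_group[g][r] += 1
    overall = Counter(r for r in responses if r is not None)
    imputed = []
    for r, g in zip(responses, groups):
        if r is None:
            # Fall back to the overall mode if the group has no observed answers.
            counts = by_group[g] or overall
            r = counts.most_common(1)[0][0]
        imputed.append(r)
    return imputed


if __name__ == "__main__":
    answers = ["agree", "agree", None, "disagree", "disagree", "disagree", None]
    grade = ["E1-E2", "E1-E2", "E1-E2", "E3-E4", "E3-E4", "E3-E4", "E3-E4"]
    print(as_separate_category(answers))
    print(impute_random(answers, random.Random(7)))
    print(impute_conditional_mode(answers, grade))
```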

Because we expect non-response to be related to characteristics of the respondents, we will conduct a non-response bias analysis to determine the extent to which estimates are affected by selection on observables into non-response (missing at random) as compared to missing completely at random. We will characterize differences in response rates by the demographic variables collected in the survey. Finally, we will comment on the extent to which we believe the data may be affected by selection into non-response that is related to unobservable variation (i.e., missing not at random).
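A simple way to see what selection on observables means in practice is a weighting-class comparison: compute response rates within demographic cells, reweight respondents by the inverse of their cell's response rate, and compare the adjusted and unadjusted estimates. The sketch below uses entirely hypothetical counts and outcome rates and is not the study's analysis plan.

```python
def response_rates(sampled, responded):
    """Response rate within each demographic cell."""
    return {cell: responded.get(cell, 0) / n for cell, n in sampled.items()}


def weighted_mean(cell_means, cell_weights):
    """Mean of cell-level estimates weighted by the supplied cell weights."""
    total = sum(cell_weights[cell] for cell in cell_means)
    return sum(cell_means[cell] * cell_weights[cell] for cell in cell_means) / total


def compare_estimates(sampled, responded, cell_means):
    """Return (unadjusted, adjusted) estimates.

    The adjusted estimate gives each respondent a weight of sampled / responded
    for their cell, so each cell counts in proportion to the number sampled.
    """
    unadjusted = weighted_mean(cell_means, responded)
    adjusted_weights = {cell: responded[cell] * (sampled[cell] / responded[cell])
                        for cell in cell_means}
    adjusted = weighted_mean(cell_means, adjusted_weights)
    return unadjusted, adjusted


if __name__ == "__main__":
    # Hypothetical cells, counts, and outcome rates for illustration only.
    sampled = {"E1-E2": 4_000, "E3-E4": 6_000}
    responded = {"E1-E2": 400, "E3-E4": 900}
    outcome_rate = {"E1-E2": 0.20, "E3-E4": 0.10}
    print(response_rates(sampled, responded))
    unadjusted, adjusted = compare_estimates(sampled, responded, outcome_rate)
    print(f"unadjusted={unadjusted:.4f} adjusted={adjusted:.4f}")
```

A gap between the two estimates suggests that non-response related to observed characteristics is shifting the result; it says nothing about non-response related to unobserved characteristics.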

We plan to use several methods to accommodate and account for potentially non-truthful responses. We include reminders about anonymity, and we will obtain a certification of confidentiality indicating to participants that their survey responses cannot be used in any Federal, State, or local civil, criminal, administrative, legislative, or other proceeding without their consent. We will include a social desirability scale in the survey to help assess the impact of non-truthful responses. Our results will be accompanied by caveats stating the potential effects of non-truthful responses.

4.  Tests of Procedures

Describe any tests of procedures or methods to be undertaken.  Testing of potential respondents (9 or fewer) is encouraged as a means of refining proposed collections to reduce respondent burden, as well as to improve the collection instrument utility.  These tests check for internal consistency and the effectiveness of previous similar collection activities.

We held discussions with Service members and former Service members about the survey instrument. Service members provided feedback to ensure that the wording was clear, the questions were appropriate for the military, and the length and format were feasible for timely completion. Service members also provided feedback about our recruitment materials, including the language used in the e-mail to participants, the timing of the e-mail, the frequency of reminder e-mails, and other methods the study team could use to recruit participants.

5.  Statistical Consultation and Information Analysis

a. Provide names and telephone number of individual(s) consulted on statistical aspects of the design.

  • John W. Dennis, Institute for Defense Analyses, 703-845-2166, jdennis@ida.org

  • Mikhail Smirnov, Institute for Defense Analyses, 703-845-6945

  • Cullen Roberts, Institute for Defense Analyses, 703-845-2352

b. Provide name and organization of person(s) who will actually collect and analyze the collected information.

  • Dina Eliezer, Institute for Defense Analyses

  • Ashlie Williams, Institute for Defense Analyses

  • Juliana Esposito, Institute for Defense Analyses

  • Sujeeta Bhatt, Institute for Defense Analyses

  • John W. Dennis, Institute for Defense Analyses

  • John Kraus, Institute for Defense Analyses

  • Shimmy Nauenberg, Institute for Defense Analyses

  • Sarah Larimer, Institute for Defense Analyses

  • Anusuya Sivaram, Institute for Defense Analyses

  • Erin Eifert, Institute for Defense Analyses





Appendix 1. Sampling Procedure

Our proposed sampling plan for managing identifiability risk is as follows:

  1. IDA defines the sample frame based on IDA’s Tier 3 DMDC holdings.

    1. The sample frame is current E1-E4 male service members across all armed services (Army, Navy, Air Force, Marine Corps, Space Force).

    2. IDA will provide a list of SSNSCR identifiers from the sample frame to DMDC, together with a small number of demographic variables for each individual that will facilitate a non-response bias analysis. IDA will verify that there are no sets of characteristics specific to 10 or fewer individuals within this sample frame.

  2. IDA will ask DMDC to return a copy of the sample frame in which a study-specific identifier has replaced the SSNSCR identifier for each individual and in which the rows have been reordered, in order to prevent reidentification.

    1. The study-specific identifier will be used to sample in waves without replacement until the desired number of respondents is reached.

  3. For each sample wave j:

    1. IDA randomly draws, without replacement and stratified on Service, N_j study-specific identifiers from the sample frame and sends them to DMDC.

    2. IDA asks DMDC to provide the email address for each study-specific identifier in the wave. Each wave of email addresses can be provided as a group in a table that does not include the identifier, so that no one outside of DMDC will be able to link the email addresses to the individual identifiers or demographic information.

  4. For the next wave, repeat steps 3.1 and 3.2, excluding already sampled individuals.

  5. Repeat steps 3 and 4 until the desired number of respondents is reached for each stratum.
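Steps 1.2 and 2 of the procedure (the small-cell check and the replacement of SSNSCR identifiers with study-specific identifiers in a reordered file) can be sketched as below. This is an assumption-laden illustration rather than IDA's or DMDC's actual process: the field names, identifier format, and toy records are hypothetical, and the seeded generator is used only to make the example repeatable.

```python
import random
from collections import Counter


def small_cells(records, keys, threshold=10):
    """Return combinations of characteristics shared by `threshold` or fewer people."""
    counts = Counter(tuple(rec[k] for k in keys) for rec in records)
    return {cell: n for cell, n in counts.items() if n <= threshold}


def assign_study_ids(records, id_field, rng):
    """Replace `id_field` with a study-specific ID and reorder the rows.

    Returns (released, crosswalk). The released rows contain no original
    identifier; the crosswalk maps study IDs back to original identifiers and
    would stay with the party that holds the administrative records.
    """
    shuffled = list(records)
    rng.shuffle(shuffled)  # reorder rows so row position carries no information
    released, crosswalk = [], {}
    for position, rec in enumerate(shuffled):
        study_id = f"S{position:06d}"  # hypothetical identifier format
        crosswalk[study_id] = rec[id_field]
        row = {k: v for k, v in rec.items() if k != id_field}
        row["study_id"] = study_id
        released.append(row)
    return released, crosswalk


if __name__ == "__main__":
    # Toy records with made-up identifiers; 20 Army and 5 Navy members.
    records = [
        {"ssnscr": f"X{i:03d}", "service": "Army" if i < 20 else "Navy", "grade": "E1-E2"}
        for i in range(25)
    ]
    print("Cells at or below threshold:", small_cells(records, ["service", "grade"]))
    released, crosswalk = assign_study_ids(records, "ssnscr", random.Random(1))
    print(released[0])
```

Any cell flagged by the check would need to be coarsened or combined before the demographic variables are shared.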



[1] DMDC, “Active Duty Military Personnel by Service by Rank/Grade.” June 2023. https://dwp.dmdc.osd.mil/dwp/app/dod-data-reports/workforce-reports

[2] Specifically, we currently plan to examine imputation with randomly imputed responses and with responses imputed using the conditional modal response. These robustness checks may change depending on what the data look like after collection.



File Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Author: Patricia Toppings
File Modified: 0000-00-00
File Created: 2026-01-07
