AsianPolmeth2022/day_two.Rmd at master · AsianPolmeth/AsianPolmeth2022 · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
---
title: "Program Day 2"
---

### January 7 (Wednesday)

```{r, echo=FALSE, include=TRUE}
# Poster Session

session_title <- c(
  "Session 5: Poster Session"
)

session_time <- c(
  "9:00–10:30 am"
)

session_zoom <- c(
  rep("https://snu-ac-kr.zoom.us/j/89206830215", length(session_time))
)

section_header <- fun_section_header(session_title, settion_time, session_zoom)


room <- c(
  "Breakout Room 1",
  "Breakout Room 2"
)

# moderator <- c(
#   "Moderator: Yuki Shiraito (University of Michigan, Ann Arbor)",
#   "Moderator: Ben Goldsmith (Australian National University)"
# )

moderator <- c(
  "Moderator: Yuki Shiraito",
  "Moderator: Ben Goldsmith"
)

authors <- c(
  ## Room 1
  "Seo-young Silvia Kim, Bernard L. Fraga, Bradley Spahn, Alan N. Yan.",
  "Michio Umeda.",
  "Matthew P. Robertson.",
  "Lucie Lu",
  ## Room 2
  "Jiongyi Cao, Kosuke Imai, Michael Lingzhi Li.",
  "Huaitian Lu, Xun Pang.",
  "Hsu Yumin Wang.",
  "Gechun Lin."
)

title <- c(
  # Room 1
  "When Do Voter Files Accurately Measure Turnout? How Transitory Voter File Snapshots Impact Research and Representation.",
  "Aggregating qualitative district-level campaign assessments to forecast the 2021 LH general election results in Japan.",
  "Bringing Big Data to Chinese Politics: Learning From the People's Daily Corpus.",
  "We Hear You: How do State-run Media Pay Attention to Online Public Opinion?",
  # Room 2
  "Experimental Evaluation of Dynamic Individualized Treatment Rules.",
  "A Causal-Predictive Machine Learning Method with Temporal Convolutional Networks for Panel Data.",
  "A Latent Measure of Mass Threats in Nondemocracies.",
  "When Counting Fails to Discover: Examining Short Text Similarity with A Meaning-Based Approach."
)


abstract <- c(
  #
  # Room 1
  #
  "",  # Kim et al.
  "Nowadays, poll aggregation is being conducted in the USA and European democracies for electoral forecasting. In Japan, however, this has not been the case because the news media report on electoral campaigns with qualitative assessments rather than poll numbers, although these assessments are based on extensive polling. Our study applies the approach Umeda (2021) developed, which aggregates the qualitative district-level election campaign coverage using the Item Response Theory, to the coming 2021 general election for Japan’s Lower House (LH) of the National Diet. Umeda (2021) applied the method to forecast the 2017 LH general election outcomes based on the media coverage available before the voting day. We examine the effectiveness of the approach by assessing the accuracy of the forecast against the actual results at the 2021 LH general election.",  # Umeda
  "Over the past decade the ability to collect massive datasets on political phenomena has allowed political scientists to ask and answer new questions. In the sub-field of Chinese politics, acquiring and analyzing large-scale text corpora is likely to become increasingly important given restrictions on fieldwork and threats to scholars' safety. Here, I present a new resource for the study of Chinese politics: a dataset of 75 years of articles in the People's Daily, comprising over 2 million records. I discuss the significance of this dataset in developing and testing theory about Chinese politics, introduce the data schema, and use the dataset to contribute new empirics to ongoing debates in Chinese politics: (1) I examine the salience of Xi Jinping in The People's Daily versus previous Communist Party leaders, including Mao Zedong, and (2) I propose a forecast model to explore whether characteristics of People's Daily editorials and news reports are predictive of imminent political purges. I then suggest other questions in Chinese politics the data could be used to address. The dataset will be made publicly available and be continuously updated on a public repository.",  # Hee
  "Winning citizens’ hearts and minds has long preoccupied autocrats, and scholars seem to agree that they have not been very skilled at it. While generations of scholars have pushed for a more nuanced view of elite communication in mobilizing voters in democracies, we know surprisingly little about how the dynamics of top-down communication shape the public in autocracies in an informational age. How do autocrats propagandize on social media so that audiences will not simply ignore, resist or ridicule their messages? I argue that autocrats are attentive to the audiences’ non-political interests even at the expense of promoting their political messages. Based on original data collection on a Twitter-like Chinese social media, Weibo, I attempt to show under what conditions do autocrats use state-controlled media to respond to the online public. I use a combination of machine learning algorithms (Spectral Clustering and Random Forests) to classify and analyze over 100, 000 trending searches and social media posts from key state-controlled media outlets. I show that specific social and entertainment topics are more likely to attract responses from the state-controlled media, while state-controlled media also engineer casual topics for public discussions. The state-controlled media’s political goals of disseminating propaganda are secondary to economic goals of gaining readership on social media. I demonstrate that 1) The state-controlled media leverage soft news to broaden their influence on social media; 2) Autocrats take cues from the online public and demonstrate responsiveness across issue spaces. The results challenge the traditional view of the state-controlled news outlets’ propaganda roles in authoritarian regimes.",  # Lucie Lu
  #
  # Room 2
  #
  "In recent years, machine learning algorithms have been used to develop individualized treatment rules (ITRs).  Applications of such methodology include personalized medicine and micro-targeting in business and politics.  What is lacking in the literature, however, is a robust way to evaluate the empirical performance of ITRs before implementing them in practice. Recently, Imai and Li (2021) introduced an experimental evaluation methodology that only relies upon the randomization of treatment assignment and random sampling of units without making any modeling assumptions. Thus, the methodology is applicable to ITRs that are derived using any generic machine learning algorithm. We extend this methodology to the dynamic ITRs in sequential multiple assignment randomized trials (SMART). We introduce an evaluation metric that decomposes the performance of a dynamic ITR into separate time periods while accounting for a budget constraint. We propose an unbiased estimator of this evaluation metric and derive its finite-sample variances. We conduct simulation studies to show that the confidence intervals based on the proposed finite-sample variance estimator have a good coverage even in a small sample size. Finally, we apply our methodology to the experimental data from the Tennessee's Student Teacher Achievement Ratio (STAR) project.",  # Cao, Imai, and Li
  "Recent developments of integrated causal-predictive machine learning models increase the learned representation of control units for counterfactual prediction while reducing model dependence. They have been increasingly used for causal inference with panel data to overcome the identification challenges arising from temporal and spatial dependencies. This research adopts the potential outcome framework and proposes a deep learning method using Temporal Convolutional Networks (TCNs) for causal inference with longitudinal data. This method is related to latent factor models and doubly-robust estimators, but it does not require proper model specifications or assume a linear history of the data-generating process (DGP). Compared to other machine learning approaches such as Recurrent Neural Network (RNNs), our TCN-based method exploits not only temporal dependence in data but also utilizes spatial relationships to learn a representation of control units. It is also more computationally efficient because the convolutional architecture allows parallel computing. We test the performance of the proposed method with simulated data. In the demonstration using empirical data, we re-analyze the example in Poulos and Zeng (2021): an RNN-based approach and its application estimating the causal effect of U.S. homestead policy on public school spending.", # Lu and Pang
  "Mass threats are a critical factor in explaining regime change and various political outcomes of authoritarian politics. However, the literature to date is divided over how to measure it in cross-national settings. To measure mass threats, numerous prior studies rely on measures related to economic grievances, whereas others emphasize the aspect of organizational capacity of mass mobilization. Additionally, substantial missing data remains a common problem of the existing measures of mass threats. In this paper, I propose a more comprehensive, latent measure of mass threats in non-democracies that seeks to bridge the divide. Utilizing a Bayesian dynamic latent variable approach, the model synthesizes information on manifest indicators from the two facets, generating time-series cross-sectional data of mass threats covering 138 authoritarian countries from 1970 to 2008. I conduct several checks to demonstrate the validity of the new measure and use it to replicate Svolik’s (2013) central results of the inverted U-shaped relationship between mass threats and military intervention.",  # Yumin
  "Political scientists have employed a variety of automatic methods to construct variables-of-interest from texts. The mainstream approaches rely on word counting to discover lexical or thematic similarity of corpora. However, they remain inaccurate for short texts due to the lack of contexts resulted by limited amounts of words. In recent years, short texts are increasingly used in political communication, delivering information crucial to modern politics. Given the need and the challenge of analyzing short texts, I propose a meaning-based approach to examine short texts based on semantic similarity. To implement that, I import a deep-learning model GAN-BERT which precisely predicts semantic relationship of text pairs. This model takes advantage of contextualized text representations produced by BERT and utilizes a semi-supervised learning framework called GAN, which reduces the percentage of labeled data required for good performance. As a demonstration, I apply the model to obtain pairwise similarity of news headlines of US Supreme Court decisions. The GAN-BERT predictions show that similar news headlines are correlated to unanimous decisions. Other traditional approaches (include cosine similarity and Smith-Waterman alignment scores) fail to detect such a relationship. The contributions are twofold. First, this paper identifies a neglected dimension of text similarity, which is useful for studying short texts. Second, I introduce political science a GAN framework to effectively conduct text analysis. The GAN-BERT model which measures semantic text similarity will enable a broad set of studies in political science, such as polarization and information spread on social media." # Genchun Lin
)

poster_link <- c(
  # Room 1
  "",
  "",
  "",
  "",
  # Room 2
  "",
  "",
  "",
  ""
)

event_name <- paste0(
  "<details> <summary>",
  authors, " _", title, "_",
  "</summary> <blockquote>", abstract, "</blockquote> </details>"
)

events <- if_else(
  poster_link != "",
  paste0(event_name, " ", "([poster](", poster_link, "))"),
  event_name
)


schedule_tab <- tibble::tibble(
  order = rep(1:4, 2),
  Time = rep(section_header, 8),
  Room = c(rep("Room1", 4), rep("Room2", 4)),
  Events = events
)

schedule_tab %>%
  select(order, Room, Events) %>%
  tidyr::pivot_wider(id_cols = order, values_from = Events, names_from = Room) %>%
  mutate(
    ` ` = rep("", 4)
  ) %>%
  select(-order) %>%
  relocate(` `) %>%
  add_row(
    ` ` = " ",
    Room1 = room[1],
    Room2 = room[2],
    .before = 1) %>%
  add_row(
    ` ` = " ",
    Room1 = moderator[1],
    Room2 = moderator[2],
    .after = 1) %>%
  kable(col.names = NULL, escape = FALSE) %>%
  kable_styling(full_width = TRUE, font_size = 20) %>%
  row_spec(
    1:2, align = "center"
  ) %>%
  #
  #
  pack_rows(
      index = schedule_tab$Time[1],
      label_row_css = "border-bottom: 3.5px solid;",
    ) %>%
  column_spec(2, width = "24em", background = "#fafafa") %>%
  column_spec(3, width = "24em", background = "#ececec")
```


```{r echo=FALSE, include=TRUE}
session_title <- c(
  "Break",
  "Session 6: Deep learning and Principal Strata Effect",
  "Break",
  "Session 7: Big Data and Causal Ordering",
  "Break",
  "Session 8: Analysis of Treaty Text and Preference",
  "Concluding Remarks"
)

session_time <- c(
  "10:30-10:40 am",
  "10:40 am -12:10 pm",
  "12:10–1:10 pm",
  "1:10-2:40 pm",
  "2:40–2:50 pm",
  "2:50–4:20 pm",
  "4:20 pm"
)

session_zoom <- c(
  rep("https://snu-ac-kr.zoom.us/j/89206830215", length(session_time))
)

section_header <- fun_section_header(session_title, settion_time, session_zoom)

authors <- c(
  "Tea Break", # 1
  # Session 6
 "Zhenhua Wang, Olanrewaju Akande, Jason Poulos, and Fan Li.",  # 2
 "Cyrus Samii, Ye Wang, Junlong Aaron Zhou.",  # 3
 "Discussant:  Kentaro Fukumoto (Gakushuin University)",  # 4
 #
 "Lunch break",  # 5
 # Session 7
 "Haohan Chen, Yiqiang Wang, Tony Zirui Yang.",  # 6
 "David Carlson, Abdulhakim Özcan.",  # 7
 "Discussant: Inbok Rhee (KDI School of Public Policy and Management)",  # 8
 #
 "Tea break",  # 9
 # Session 8
 "Soo Yeon Kim, Thiyaghessan s/o Poongundranar.",  # 10
 "ByeongHwa Choi and Yesola Kweon.",  # 11
 "Discussant: Sung Eun Kim (Korea University)",  # 12
 #
 "Concluding Remarks"  # 13
)

title <- c(
  "",  # 1
  # Session 6
 "Are deep learning models superior for missing data imputation in surveys? Evidence from an empirical comparison.",  # 2
 "Generalizing Covariate-tightened Trimming Bounds for Principal Strata Effects Using Adaptive Kernels.",  # 3
 "",  # 4
 #
 "",  # 5
 # Session 7
 "Emotional Propaganda: An Audio-As-Data Approach to Chinese State-Run Cable News.",  # 6
 "Estimating Causal Orderings and Relationships with Causal Gaussian Processes.", # 7
 "",  # 8
 #
 "",  # 9
 # Session 8
 "The Language of Institutional Design: Text Similarity in Preferential Trade Agreements.",  # 10
 "Age and Trade Policy Preferences in an Aging Society: Evidence from Japan.", # 11
 "",  # 12
 #
 ""  # 13
)


event_name <- if_else(title != "",
  paste0(authors, "\n
  _", title, "_"),
  paste0(authors)
)


paper_link <- c(
  "",  # 1
  "https://drive.google.com/file/d/141Oa1XixVsYB52MuU08lNfhp7kjklPei/view?usp=sharing",  # 2
  "https://drive.google.com/file/d/1BpGSs3p1l4onVObwPBakajmfm_qf1p-v/view?usp=sharing",  # 3
  "",  # 4
  "",  # 5
  "https://drive.google.com/file/d/1r-sCqbL0luaXewQTUJWWnABVkQKJGRCK/view?usp=sharing",  # 6
  "https://drive.google.com/file/d/1jNB1vVEHuazqvJhIqK_9S8p9sWIZO-M4/view?usp=sharing",  # 7
  "",  # 8
  "",  # 9
  "https://drive.google.com/file/d/1jA_bJEyGhYDGYFzlEJJ9NG5Nf0Suri6g/view?usp=sharing",  # 10
  "https://drive.google.com/file/d/1-R1GQmKT6wIEpUvDrL0ozaoWJ0nFxcZB/view?usp=sharing",  # 11
  "",  # 12
  ""  # 13
)

events <- if_else(
  paper_link != "",
  paste0(event_name, " ", "([paper](", paper_link, "))"),
  event_name
)

schedule_tab <- tibble(
  Time = c(
      section_header[1],
      rep(section_header[2], 3),
      section_header[3],
      rep(section_header[4], 3),
      section_header[5],
      rep(section_header[6], 3),
      section_header[7]
    ),
  Time_fill = rep("", length(events)),
  Time_start = c(
    "",
    # Session 6
    "10:40 am-", "11:10 am-", "11:40 am-",
    "",
    # Session 7
    "1:10 pm-", "1:40 pm-", "2:10 pm-",
    "",
    # Session 8
    "2:50 pm-", "3:20 pm-", "3:50 pm-",
    ""
  ),
  Events = events
)

schedule_tab %>%
  select(Time_start, Events) %>%
  kable(col.names = NULL) %>%
    kable_styling(full_width = TRUE, font_size = 20) %>%
    column_spec(1, bold = FALSE) %>%
    column_spec(2,
                background = c(
                  "#BCC5C5",
                  rep("#E5E5E5", 3),
                  "#BCC5C5",
                  rep("#E5E5E5", 3),
                  "#BCC5C5",
                  rep("#E5E5E5", 3),
                  "#2c3e50"
                  ),
                color = c(
                  "white",
                  rep("black", 3),
                  "white",
                  rep("black", 3),
                  "white",
                  rep("black", 3),
                  "white"
                  )) %>%
    pack_rows(
      index = table(forcats::fct_inorder(schedule_tab$Time)),
      label_row_css = "border-bottom: 3.5px solid;",
    ) %>%
  column_spec(1, width = "7em")
```