### Work Experience

• 2020

Amazon

• 2020

NVIDIA

• 2017 - 2020

#### Research Assistant

University of North Carolina at Charlotte, Charlotte Visualization Center

• 2016

#### Data Analyst

Walmart Corporation, Global Shared Services, Charlotte

• 2013

#### Visiting Research Intern

Indian Institute of Technology, Madras, Department of Management Sciences

### Education

• Ph.D. Current

Ph.D. in Computer Science

University of North Carolina at Charlotte

• M.S. 2015

Computer Science

University of North Carolina at Charlotte

• B.E. 2014

Computer Science and Engineering

Sri Venkateswara College of Engineering

### News, Awards and Grants

• Oct 2020
1 paper accepted at Findings of EMNLP and 1 paper accepted at INLG 2020
• Sept 2020
Research Internship at Amazon Alexa AI under the guidance of Behnam Hedayatnia and Dilek Hakkani-Tur
• May 2020
Research Internship at NVIDIA under the guidance of Wei Ping, Mohammad Shoeybi, Raul Puri and Mostofa Patwary
• April 2020
A new paper titled "Modeling Conversation Context by Adapting Cognitive Architectures" published at the BAICS workshop, ICLR 2020
• April 2020
Two papers accepted at the STOC workshop associated with the LREC conference 2020
• Dec 2019
A new paper titled "Studying the effects of Cognitive Biases in Evaluation of Conversational Agents" published at the ACM CHI 2020 conference

*Honorable Mention Award*

• Nov 2019
A new paper titled "Detecting Asks in Social Engineering Attacks: Impact of Linguistic and Structural Knowledge" published at the AAAI 2020 conference
• Sept 2019
Two papers accepted at the conference on computational creativity in NLG and discourse structure in NLG.
• Sept 2019
New Paper on Emotional Language Generation using Reinforcement Learning accepted at ICMLA 2019
• June 2019
New Paper on experiment design for dialogue system evaluation accepted at INLG 2019
• June 2019
Conversational Intelligence Summer School (Best Project)
Received the best project award during the CISS School for building a visual dialog game playing bot and completed an ablation study on the VisDial 2019 challenge.
• June 2019
Conversational Intelligence Summer School (Selected)
Shortlisted to take part in the 2nd Conversational Intelligence Summer School.
• Oct 2018
Recipient of the Udacity PyTorch scholarship award provided by Facebook.
• June 2018
Student Scholarship Award
Recipient of the student scholarship award for ICWSM 2018, organized by the Association for the Advancement of Artificial Intelligence (AAAI).

### Dialogue Systems

Projects:

• Local Knowledge Powered Conversational Agents
• Learning to Plan and Realize Separately for Open-Ended Dialogue Systems
• Using Reinforcement Learning with External Rewards for Open-Domain Natural Language Generation
• Emotional Neural Language Generation Grounded in Situational Contexts
• A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions
• Modeling Conversation Context by Adapting Cognitive Architectures
### Evaluation of Dialogue Systems

Projects:

• Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions
• Studying the Effects of Cognitive Biases in Evaluation of Conversational Agents
• Towards best experiment design for evaluating dialogue system output
### Countering Social Engineering Attacks

Projects:

• Active Defense Against Social Engineering: The Case for Human Language Technology
• The Panacea Threat Intelligence and Active Defense Platform
• Adaptation of a Lexical Organization for Social Engineering Detection and Response Generation
• Detecting Asks in Social Engineering Attacks: Impact of Linguistic and Structural Knowledge
### Cognitive Biases in Decision Making

Projects:

• Investigating Effects of Visual Anchors on Decision-Making about Misinformation
• Vulnerable to misinformation? Verifi!
• Can You Verifi This? Studying Uncertainty and Decision-Making About Misinformation Using Visual Analytics.
• Anchored in a data storm: How anchoring bias can affect user strategy, confidence, and decisions in visual analytics


#### Using Reinforcement Learning with External Rewards for Open-Domain Natural Language Generation

Vidhushini Srinivasan, Sashank Santhanam, Samira Shaikh
Journal Paper, Journal of Intelligent Information Systems

#### Abstract

We propose a new approach towards emotional natural language generation using a bidirectional seq2seq model. Our goal is to generate emotionally relevant language that accommodates the emotional tone of the prior context. To incorporate emotional information, we train our own embeddings appended with emotion values through valence, arousal and dominance scores. We use a reinforcement-learning framework, which is tuned using a policy gradient method. Two of the internal rewards in our reinforcement learning framework, viz. Ease of Answering and Semantic Coherence, are based on prior state-of-the-art. We propose a new internal reward, Emotional Intelligence, computed by minimizing the affective dissonance between the source and generated text. We also train a separate external reward analyzer to predict the rewards as well as to maximize the expected rewards (both internal and external). We evaluate the system on two common corpora used for Natural Language Generation tasks: the Cornell Movie Dialog and Yelp Restaurant Review Corpus. We report standard evaluation metrics including BLEU, ROUGE-L and perplexity as well as human evaluation to validate our approach. We demonstrate the ability of the proposed model to generate emotionally appropriate responses on both corpora.

#### A Survey of Natural Language Generation Techniques with a Focus on Dialogue Systems - Past, Present and Future Directions

Sashank Santhanam, Samira Shaikh
Journal Paper, arXiv

#### Abstract

One of the hardest problems in the area of Natural Language Processing and Artificial Intelligence is automatically generating language that is coherent and understandable to humans. Teaching machines how to converse as humans do falls under the broad umbrella of Natural Language Generation. Recent years have seen unprecedented growth in the number of research articles published on this subject in conferences and journals both by academic and industry researchers. There have also been several workshops organized alongside top-tier NLP conferences dedicated specifically to this problem. All this activity makes it hard to clearly define the state of the field and reason about its future directions. In this work, we provide an overview of this important and thriving area, covering traditional approaches, statistical approaches and also approaches that use deep neural networks. We provide a comprehensive review towards building open domain dialogue systems, an important application of natural language generation. We find that, predominantly, the approaches for building dialogue systems use seq2seq or language model architectures. Notably, we identify three important areas of further research towards building more effective dialogue systems: 1) incorporating larger context, including conversation context and world knowledge; 2) adding personae or personality in the NLG system; and 3) overcoming dull and generic responses that affect the quality of system-produced responses. We provide pointers on how to tackle these open problems through the use of cognitive architectures that mimic human language understanding and generation capabilities.

#### Local Knowledge Powered Conversational Agents

Sashank Santhanam, Wei Ping, Raul Puri, Mohammad Shoeybi, Mostofa Patwary, Bryan Catanzaro
Poster Presentation, arXiv, 2020 (upcoming NAACL submission)

#### Abstract

State-of-the-art conversational agents have advanced significantly in conjunction with the use of large transformer-based language models. However, even with these advancements, conversational agents still lack the ability to produce responses that are informative and coherent with the local context. In this work, we propose a dialog framework that incorporates both local knowledge as well as users' past dialogues to generate high quality conversations. We introduce an approach to build a dataset based on Reddit conversations, where outbound URL links are widely available in the conversations and the hyperlinked documents can be naturally included as local external knowledge. Using our framework and dataset, we demonstrate that incorporating local knowledge can largely improve informativeness, coherency and realisticness measures using human evaluations. In particular, our approach consistently outperforms the state-of-the-art conversational model on the Reddit dataset across all three measures. We also find that scaling the size of our models from 117M to 8.3B parameters yields consistent improvement of validation perplexity as well as human evaluated metrics. Our model with 8.3B parameters can generate human-like responses as rated by various human evaluations in a single-turn dialog setting.

#### Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions

David M. Howcroft, Anya Belz, Miruna Clinciu, Dimitra Gkatzia, Sadid A. Hasan, Saad Mahamood, Simon Mille, Emiel van Miltenburg, Sashank Santhanam, Verena Rieser
Poster Presentation, Conference Paper in 13th International Natural Language Generation Conference, 2020

#### Abstract

Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches and a proliferation of different quality criteria used by researchers make it difficult to compare results and draw conclusions across papers, with adverse implications for meta-evaluation and reproducibility. In this paper, we present (i) our dataset of 165 NLG papers with human evaluations, (ii) the annotation scheme we developed to label the papers for different aspects of evaluations, (iii) quantitative analyses of the annotations, and (iv) a set of recommendations for improving standards in evaluation reporting. We use the annotations as a basis for examining information included in evaluation reports, and levels of consistency in approaches, experimental design and terminology, focusing in particular on the 200+ different terms that have been used for evaluated aspects of quality. We conclude that due to a pervasive lack of clarity in reports and extreme diversity in approaches, human evaluation in NLG presents as extremely confused in 2020, and that the field is in urgent need of standard methods and terminology.

#### Learning to Plan and Realize Separately for Open-Ended Dialogue Systems

Sashank Santhanam, Zhuo Cheng, Brodie Mather, Bonnie Dorr, Archna Bhatia, Bryanna Hebenstreit, Alan Zemel, Adam Dalton, Tomek Strzalkowski, Samira Shaikh
Poster Presentation, Conference Paper in Findings of EMNLP, 2020

#### Abstract

Achieving true human-like ability to conduct a conversation remains an elusive goal for open-ended dialogue systems. We posit this is because extant approaches towards natural language generation (NLG) are typically construed as end-to-end architectures that do not adequately model human generation processes. To investigate, we decouple generation into two separate phases: planning and realization. In the planning phase, we train two planners to generate plans for response utterances. The realization phase uses response plans to produce an appropriate response. Through rigorous evaluations, both automated and human, we demonstrate that decoupling the process into planning and realization performs better than an end-to-end approach.

#### Adaptation of a Lexical Organization for Social Engineering Detection and Response Generation

Archna Bhatia, Adam Dalton, Brodie Mather, Sashank Santhanam, Samira Shaikh, Alan Zemel, Tomek Strzalkowski, Bonnie J. Dorr
Poster Presentation, Conference Paper in Social Threats in Online Conversations, LREC, 2020

#### Abstract

We present a paradigm for extensible lexicon development based on Lexical Conceptual Structure to support social engineering detection and response generation. We leverage the central notions of ask (elicitation of behaviors such as providing access to money) and framing (risk/reward implied by the ask). We demonstrate improvements in ask/framing detection through refinements to our lexical organization and show that response generation qualitatively improves as ask/framing detection performance improves. The paradigm presents a systematic and efficient approach to resource adaptation for improved task-specific performance.

#### Active Defense Against Social Engineering: The Case for Human Language Technology

Adam Dalton, Ehsan Aghaei, Ehab Al-Shaer, Archna Bhatia, Esteban Castillo, Zhuo Cheng, Sreekar Dhaduvai, Qi Duan, Bryanna Hebenstreit, Md Mazharul Islam, Younes Karimi, Amir Masoumzadeh, Brodie Mather, Sashank Santhanam, Samira Shaikh, Alan Zemel, Tomek Strzalkowski, Bonnie J. Dorr
Poster Presentation, Conference Paper in Social Threats in Online Conversations, LREC, 2020

#### Abstract

We describe Panacea, a system that supports natural language processing (NLP) components for active defenses against social engineering attacks. We deploy a pipeline of human language technology, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. Panacea processes modern message formats through a plug-in architecture to accommodate innovative approaches for message analysis, knowledge representation and dialogue generation. The novelty of the Panacea system is that it uses NLP for cyber defense and engages the attacker using bots to elicit evidence to attribute to the attacker and to waste the attacker's time and resources.

#### Studying the effects of Cognitive Biases in Evaluation of Conversational Agents *Honorable Mention Award*

Sashank Santhanam, Alireza Karduni, Samira Shaikh
Poster Presentation, Conference Paper in ACM CHI 2020, Hawaii

#### Abstract

Humans quite frequently interact with conversational agents. The rapid advancement in generative language modeling through neural networks has helped advance the creation of intelligent conversational agents. Researchers typically evaluate the output of their models through crowdsourced judgments, but there are no established best practices for conducting such studies. Moreover, it is unclear if cognitive biases in decision-making are affecting crowdsourced workers' judgments when they undertake these tasks. To investigate, we conducted a between-subjects study with 77 crowdsourced workers to understand the role of cognitive biases, specifically anchoring bias, when humans are asked to evaluate the output of conversational agents. Our results provide insight into how best to evaluate conversational agents. We find increased consistency in ratings across two experimental conditions may be a result of anchoring bias. We also determine that external factors such as time and prior experience in similar tasks have effects on inter-rater consistency.

#### Detecting Asks in Social Engineering Attacks: Impact of Linguistic and Structural Knowledge

Bonnie J. Dorr, Archna Bhatia, Adam Dalton, Brodie Mather, Bryanna Hebenstreit, Sashank Santhanam, Zhuo Cheng, Samira Shaikh, Alan Zemel, Tomek Strzalkowski
Poster Presentation, Conference Paper in 34th AAAI Conference on Artificial Intelligence, New York

#### Modeling Conversation Context by Adapting Cognitive Architectures

Sashank Santhanam, Samira Shaikh
Poster Presentation, Conference Paper in Bridging AI and Cognitive Science, ICLR, 2020

#### Abstract

In this paper, we present an approach for language generation for open-domain dialog systems inspired by neurocognitive memory processes. We overcome the drawbacks of the traditional seq2seq architecture by augmenting the architecture with two types of memory, namely, long-term and working memory based on the Standard Model of cognition. We also implement a novel action selection mechanism that helps identify the relevant utterances containing salient information from long-term memory to working memory. To evaluate our model, we compare our action selection mechanism with the state-of-the-art baseline and observe improvements in the identification of most salient utterances over long conversations and our mechanism shows a higher correlation to human rankings.

#### Emotional Neural Language Generation Grounded in Situational Contexts

Sashank Santhanam, Samira Shaikh
Oral Presentation, Conference Paper at 4th Workshop on Computational Creativity in Natural Language Generation, Oct 2019, Tokyo, Japan

#### Abstract

Emotional language generation is one of the keys to human-like artificial intelligence. Humans use different types of emotions depending on the situation of the conversation. Emotions also play an important role in mediating the engagement level with conversational partners. However, current conversational agents do not effectively account for emotional content in the language generation process. To address this problem, we develop a language modeling approach that generates affective content when the dialogue is situated in a given context. We use the recently released Empathetic-Dialogues corpus to build our models. Through detailed experiments, we find that our approach outperforms the state-of-the-art method on the perplexity metric by about 5 points and achieves a higher BLEU metric score.

#### Towards Best Experiment Design for Evaluating Dialogue System Output

Sashank Santhanam, Samira Shaikh
Oral Presentation, Conference Paper at 12th International Natural Language Generation Conference, Oct 2019, Tokyo, Japan

#### Abstract

To overcome the limitations of automated metrics (BLEU, METEOR etc.) in evaluating dialogue systems, researchers typically use human judgments to provide convergent evaluation ratings. While prior research has shown that human judgments can suffer from inconsistency of ratings, extant research has also found that the design of the evaluation task affects the consistency and quality of human judgments. We conduct a between-subjects study to understand the impact of four experiment designs on human ratings of dialogue system output. In addition to discrete and continuous scale ratings, we also experiment with a novel application of Best-Worst scaling to dialogue evaluation. Through our systematic study with 40 crowdsourced workers in each task, we find that experiment design using continuous scales achieves more consistent ratings than Likert scale or ranking-based experiment design. Additionally, we find that factors such as time taken to complete the task and no prior experience of participating in similar studies of rating dialogue system output positively impact consistency and agreement amongst raters.

#### Investigating Effects of Visual Anchors on Decision-Making about Misinformation

Ryan Wesslen, Sashank Santhanam, Alireza Karduni, Isaac Cho, Samira Shaikh, Wenwen Dou
Oral Presentation, Conference Paper in 21st International IEEE Conference on Visualization (EuroVis), Porto, Portugal, June 2019

#### Abstract

Cognitive biases are systematic errors in judgment due to an over-reliance on rule-of-thumb heuristics. Recent research suggests that cognitive biases, like numerical anchoring, transfer to visual analytics in the form of visual anchoring. However, it is unclear how visualization users can be visually anchored and how the anchors affect decision-making. To investigate, we performed a between-subjects laboratory experiment with 94 participants to analyze the effects of visual anchors and strategy cues using a visual analytics system. The decision-making task was to identify misinformation from Twitter news accounts. Participants were randomly assigned to conditions that modified the scenario video (visual anchor) and/or strategy cues provided. Our findings suggest that such interventions affect user activity, speed, confidence, and, under certain circumstances, accuracy. We discuss implications of our results on the forking paths problem and raise concerns on how visualization researchers train users to avoid unintentionally anchoring users and affecting the end result.

#### Vulnerable to Misinformation? Verifi!

Alireza Karduni, Isaac Cho, Ryan Wesslen, Sashank Santhanam, Svitlana Volkova, Dustin Arendt, Samira Shaikh, Wenwen Dou
Oral Presentation, Conference Paper in 24th ACM Conference on Intelligent User Interfaces, Mar 2019

#### Abstract

We present Verifi2, a visual analytic system to support the investigation of misinformation on social media. Various models and studies have emerged from multiple disciplines to detect or understand the effects of misinformation. However, there is still a lack of intuitive and accessible tools that help social media users distinguish misinformation from verified news. Verifi2 uses state-of-the-art computational methods to highlight linguistic, network, and image features that can distinguish suspicious news accounts. By exploring news on a source and document level in Verifi2, users can interact with the complex dimensions that characterize misinformation and contrast how real and suspicious news outlets differ on these dimensions. To evaluate Verifi2, we conduct interviews with experts in digital media, communications, education, and psychology who study misinformation. Our interviews highlight the complexity of the problem of combating misinformation and show promising potential for Verifi2 as an educational tool on misinformation.

#### Can You Verifi This? Studying Uncertainty and Decision-Making about Misinformation in Visual Analytics

Alireza Karduni, Ryan Wesslen, Sashank Santhanam, Isaac Cho, Svitlana Volkova, Dustin Arendt, Samira Shaikh, Wenwen Dou
Oral Presentation, Conference Paper in 12th International AAAI Conference on Web and Social Media, Stanford, USA, June 2018

#### Abstract

We describe a novel study of decision-making processes around misinformation on social media. Using a custom-built visual analytic system, we presented users with news content from social media accounts from a variety of news outlets, including outlets engaged in distributing misinformation. We conducted controlled experiments to study decision-making regarding the veracity of these news outlets and tested the role of confirmation bias (the tendency to ignore contradicting information) and uncertainty of information on human decision-making processes. Our findings reveal that the presence of conflicting information, presented to users in the form of cues, impacts the ability to judge the veracity of news in systematic ways. We also find that even instructing participants to explicitly disconfirm given hypotheses does not significantly impact their decision-making regarding misinformation when compared to a control condition. Our findings have the potential to inform the design of visual analytics systems so that they may be used to mitigate the effects of cognitive biases and stymie the spread of misinformation on social media.

#### I Stand With You: Using Emojis to Study Solidarity in Crisis Events

Sashank Santhanam, Vidhushini Srinivasan, Shaina Glass, Samira Shaikh
Oral Presentation, Conference Paper in 1st International Workshop on Emoji Understanding and Applications in Social Media (ICWSM), June 2018

#### Abstract

We study how emojis are used to express solidarity in social media in the context of two major crisis events - a natural disaster, Hurricane Irma in 2017 and terrorist attacks that occurred in November 2015 in Paris. Using annotated corpora, we first train a recurrent neural network model to classify expressions of solidarity in text. Next, we use these expressions of solidarity to characterize human behavior in online social networks, through the temporal and geospatial diffusion of emojis. Our analysis reveals that emojis are a powerful indicator of sociolinguistic behaviors (solidarity) that are exhibited on social media as the crisis events unfold.

#### I Stand With You: Detecting and Characterizing Expressions of Solidarity in Social Media

Sashank Santhanam, Samira Shaikh
Oral Presentation, Conference Paper in 4th International Conference on Computational Social Science (IC2S2), Chicago, USA, July 2018

#### Abstract

On November 13th 2015, attackers carried out suicide bombings and mass shootings at two separate locations in Paris, France. Over a hundred people were killed and scores more were injured in these incidents, now known as the November 2015 Paris attacks (https://tinyurl.com/pb2bohv). People all over the world took to social media to express their reactions and convey their thoughts about this event. Expressions of solidarity formed a large proportion of these expressions made on social media. Solidarity is a sociological concept and a prosocial behavior, characterized by Bayertz in relation to complementary terms such as "community spirit or mutual attachment, social cooperation or charity". The collective enactment of online behaviors, including prosocial behaviors such as solidarity, has been known to directly affect political mobilization and social movements. There is thus a pressing need to understand what drives and, more importantly, what characterizes the convergence of a global public in online social networks, especially in the immediate aftermath of crisis events. We report on the first study to (a) create a corpus of social media posts annotated with respect to expressions of solidarity; and (b) develop and test computational models that automatically detect expressions of solidarity. We present initial analysis with respect to the content of solidarity expressions. Our approach and findings will help advance research in the dynamics of online mobilization.

#### The Anchoring Effect in Decision-Making with Visual Analytics

Isaac Cho, Ryan Wesslen, Alireza Karduni, Sashank Santhanam, Samira Shaikh, Wenwen Dou
Oral Presentation, Conference Paper in 12th International IEEE Conference on Visual Analytics Science and Technology, Phoenix, Arizona, Oct 2017

#### Abstract

The anchoring effect is the tendency to focus too heavily on one piece of information when making decisions. In this paper, we present a novel, systematic study and resulting analyses that investigate the effects of anchoring on human decision-making using visual analytic systems. Visual analytics interfaces typically contain multiple views that present various aspects of information such as spatial, temporal, and categorical. These views are designed to present complex, heterogeneous data in accessible forms that aid decision-making. However, human decision-making is often hindered by the use of heuristics, or cognitive biases, such as the anchoring effect. The anchoring effect can be triggered by the order in which information is presented or the magnitude of information presented. Through carefully designed laboratory experiments, we present evidence of the anchoring effect in analysis with visual analytics interfaces when users are primed by representation of different pieces of information. We also describe detailed analyses of users' interaction logs which reveal the impact of anchoring bias on the visual representation preferred and paths of analysis. We discuss implications for future research to possibly detect and alleviate anchoring bias.

#### Modeling Social Engineering Risk using Attitudes, Actions, and Intentions Reflected in Language Use

Adam Dalton, Alan Zemel, Amirreza Masoumzadeh, Archna Bhatia, Bonnie Dorr, Brodie Mather, Bryanna Hebenstreit, Ehab Al-Shaer, Ellisa Khoja, Esteban Castillo Juarez, Larry Bunch, Marcus Vlahovic, Peng Liu, Peter Pirolli, Rushabh Shah, Sabina Cartacio, Samira Shaikh, Sashank Santhanam, Sreekar Dhaduvai, Tomek Strzalkowski, Younes Karimi [ordered in alphabetical order]
Poster in 32nd International FLAIRS Conference, Florida, USA, May 2019

#### Salient Context Identification from Memory for Neural Dialog Systems

Sashank Santhanam, Samira Shaikh
Poster in 2nd Southern Data Science Conference, Atlanta, USA, April 2019.

#### Abstract

Dialog Systems like Siri and Alexa are capable of producing a coherent response for a particular input utterance. These systems use prominent state-of-the-art techniques in deep learning like sequence-to-sequence (seq2seq) mechanisms and incorporate aspects of cognitive science. However, prior research done in this area has shown that engaging with these systems for longer interactions could result in dull and generic responses in open-domain situations due to their reliance on a shorter context of conversation - namely, the most recent utterance in the dialogue history. To address the challenge of identifying the salient context from conversation, we take a cognitive science approach. Our approach relies on an adaptation of the Standard Model, an established model of memory in cognitive science.

#### Propaganda or Clickbait? Understanding and Classifying Types of Misinformation using Recurrent Neural Networks

Sashank Santhanam, Samira Shaikh
Poster in 1st Southern Data Science Conference, Atlanta, USA, April 2018

#### Understanding the emoji of solidarity

Sashank Santhanam, Samira Shaikh
Press in The Conversation, July 16, 2018

### Talks (last update April 2019)

• Dec 2018

#### Neural Networks for Natural Language Generation

Talk covering neural network approaches for Natural Language Generation and how Cognitive Architectures can help address the drawbacks that current techniques suffer from. Slides will be uploaded soon.

• Dec 2018

• April 2018

• April 2019