Behind the Sarcasm: Exploring Sentiment in The Office

DataRes at UCLA
Jan 13, 2025


Authors: Maddy Yip, Georgia Sherr, Adya Ganti, Michelle Sun, Daniel Song


Introduction

Sitcoms have long been a popular source of entertainment, offering a combination of deadpan humor and relatable moments that invoke both laughter and tears from all types of audience members. Sitcoms are best known for their irony- and sarcasm-packed dialogue, laced with emotional nuances that audiences find comfort in as they resonate with the heartfelt connections portrayed. Yet modern technology may find it challenging to detect these tonal subtleties.

This article analyzes the sentiment trends of the widely beloved sitcom The Office, uncovering the emotional evolution of characters across the course of the show and offering insights on how effectively sentiment analysis can detect the complexities of sarcasm.

The following visuals in this article provide a look into the sentiment trends of various aspects of The Office. We used VADER to perform our sentiment analysis; it evaluates a given text and returns negative, neutral, and positive scores along with a compound score on a scale from -1 (most negative) to 1 (most positive), where 0 is considered neutral. The dataset we employed separated the dialogue so that each line spoken by a character was its own entry, categorized by the season and episode in which it took place as well as the character who said it.

What Words Contributed the Most to Positive and Negative Sentiment?

The first aspect we wanted to analyze was which words contributed most to the positive and negative sentiments of The Office. To achieve this, we created bar charts representing both strong positive and strong negative sentiments. These visualizations specifically highlight the most frequent words with a VADER sentiment score exceeding 0.7 for strong positive sentiment and below -0.7 for strong negative sentiment.

The sentiment analysis was applied to individual lines of dialogue, with each line assigned a compound score: a normalized, weighted composite of its positive, negative, and neutral sentiment. Words such as “good”, “great”, and “love” highlight moments of joy and enthusiasm within the show, whereas terms such as “ill”, “bad”, and “terrible” underscore moments of conflict and intensity. By observing each individual word in the bar graph, we can see how specific language shapes the emotional expression within the dataset.

Contrasting the emotional tone of these words, we see that those in the “High Positive Sentiment” category express gratitude, resolution, humor, or admiration, whereas those in the “High Negative Sentiment” category hint at conflict and criticism. The presence of such sentiment-rich words points to their narrative significance, suggesting potential turning points, character development, and overall character dynamics within the show itself.

One interesting trend we noticed was the number of sarcasm-laden words among the high negative sentiment words. Since sarcasm often carries a mixed rather than purely negative sentiment, this highlighted a limitation of sentiment models like VADER, which cannot account for tone or situational context. Nonetheless, the visualization effectively pinpointed the linguistic drivers of sentiment, offering valuable insight into narrative shifts and character development within the dialogue.

How do Sentiments Differ Among Various Characters?

After analyzing which words contribute to negative and positive sentiments, we can start to analyze the different sentiments between characters on The Office. The visualization above depicts the compound sentiment score for the characters Michael, Jim, Pam, Dwight, and Phyllis.

To perform our sentiment analysis, we used VADER, a sentiment model that evaluates given text and categorizes it into a negative, neutral, or positive score. Using a scale of -1 to 1 (with -1 demonstrating an extremely negative sentiment, 1 depicting an extremely positive sentiment, and 0 being a neutral sentiment), we were able to utilize VADER to determine the sentiments of each character.

As shown in the visualization, we analyzed each character’s average compound sentiment score, calculated by averaging the per-line compound scores across all of that character’s dialogue.
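The per-character aggregation can be sketched as below, assuming the dataset is a table with one row per spoken line and a precomputed compound score. The characters are from the show, but the compound values here are made up for illustration.

```python
# Average each character's per-line compound scores.
import pandas as pd

dialogue = pd.DataFrame({
    "character": ["Michael", "Michael", "Pam", "Pam", "Dwight"],
    "compound":  [0.40, -0.15, 0.30, -0.03, 0.05],
})

char_sentiment = dialogue.groupby("character")["compound"].mean()
print(char_sentiment)
```

Run over the full dataset, this produces the one-number-per-character summary plotted in the bar chart.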

With scores ranging between 0.11 and 0.14, the visualization demonstrates a mostly neutral tone for these characters’ dialogue. Since the values are close to 0, we can conclude that the overall sentiment of each of these characters is roughly neutral.

By a small margin, Pam and Phyllis have higher compound sentiment scores of 0.135 and 0.14 (respectively), compared to Michael, Jim, and Dwight, who have scores of 0.125, 0.12, and 0.118 (respectively). Although the difference is small, this suggests the female characters skew slightly more positive.

However, sentiment models have limitations: VADER, for example, does not account for tone or sarcasm in situational context. Nevertheless, this visualization clearly depicts the relationship between characters and their sentiments, highlighting possible character development over the seasons through their dialogue.

How does Sentiment Evolve over Seasons?

The above line graph depicts the sentiment evolution of the major characters in The Office across the show’s nine seasons. The average scores range from a low of about -0.5 to a peak of about 0.6 on the VADER compound scale. The y-axis represents the average sentiment score, while the x-axis delineates the seasons, allowing for an examination of shifts in character mood and tone over time.

Several key trends emerge from the data. Characters such as Michael (light blue) and Pam (the other blue) exhibit greater variability in sentiment scores than their co-stars, mirroring their prominent roles in the storyline and their pronounced emotional arcs. Michael’s sentiment fluctuates significantly, with occasional positive spikes that reflect his unpredictable, often comedic behavior as well as moments of personal growth and redemption. Dwight’s trajectory, on the other hand, remains relatively stable, suggesting a more consistent portrayal of his stoic and eccentric nature. Jim and Pam, both of whom feature prominently in the series’ romantic subplot, generally maintain positive sentiment scores, though they experience occasional dips, likely reflecting moments of tension or conflict in their relationship. More antagonistic characters like Angela and Ryan display consistently negative sentiment scores throughout the series. Finally, a significant negative trend occurs across most characters in Seasons 7 and 8, indicating that a major plot development (i.e., Michael’s departure from the show) caused a shift in the emotional tone of the series.
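The per-season trend lines can be computed with one further grouping: average the compound score per character per season, then pivot so each character becomes a column that can be drawn as a line. The values below are illustrative, not taken from the show.

```python
# Per-season, per-character average compound scores.
import pandas as pd

dialogue = pd.DataFrame({
    "season":    [1, 1, 1, 2, 2, 2],
    "character": ["Michael", "Jim", "Michael", "Michael", "Jim", "Jim"],
    "compound":  [0.20, 0.10, 0.40, -0.10, 0.25, 0.15],
})

trend = (
    dialogue.groupby(["season", "character"])["compound"]
    .mean()
    .unstack("character")  # rows: seasons, columns: characters
)
print(trend)
```

Calling `trend.plot()` (with matplotlib installed) would then reproduce a line graph of this shape, one line per character.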

Overall, the graph provides a quantitative representation of the show’s evolving character dynamics and emotional undertones, shedding light on how plot developments and character relationships influence sentiment throughout The Office.

How does Sentiment Differ between Female and Male Characters?

After a thorough comparison of sentiment trends among distinct characters throughout the show’s seasons, we are led to wonder how a deeper analysis may reveal differences in sentiment among the main male versus female characters of the show.

The following visual provides a line plot depicting the differences in average compound sentiment scores of the female and male characters for each season of The Office.

Overall, the male characters tend to have a slightly more positive compound score. Both groups follow similar trends, with a stark decrease in score around seasons 5 and 6 and a steady increase until the end of the show. This implies that male characters in the sitcom tend to be portrayed in a slightly more positive emotional context. The most prominent difference appears in season 1, in which the men had a significantly higher compound score of around 0.17 versus roughly 0.107 for the women. However, it is important to note that season 1 contains only 6 episodes, as opposed to roughly 20 for each of the other seasons, which may account for the larger gap in average compound scores.

In general, the male characters’ slightly higher compound score follows the same pattern as the female characters’. This suggests the sitcom maintained a fairly balanced emotional tone among its characters, with all the main characters engaged in significant emotional plot points, reflective of cohesive storytelling.
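This comparison only needs one extra column: a character-to-gender mapping joined onto the dialogue table before grouping. The mapping and the compound values below are illustrative.

```python
# Average compound score per season, split by character gender.
import pandas as pd

gender = {"Michael": "M", "Jim": "M", "Dwight": "M",
          "Pam": "F", "Phyllis": "F", "Angela": "F"}

dialogue = pd.DataFrame({
    "season":    [1, 1, 1, 1],
    "character": ["Michael", "Jim", "Pam", "Phyllis"],
    "compound":  [0.18, 0.16, 0.10, 0.114],
})
dialogue["gender"] = dialogue["character"].map(gender)

by_gender = dialogue.groupby(["season", "gender"])["compound"].mean()
print(by_gender)
```

Unstacking `by_gender` by gender and plotting gives the two-line comparison described above.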

It can also be seen that the average compound scores are fairly neutral, never exceeding 0.17. This indicates that sentiment analysis models may have more trouble interpreting the emotional subtleties of sarcasm as it tends to mix both positive and negative words in each statement, which would cancel each other out in the compound score. Additionally, tone and situational irony are imperative when it comes to understanding sarcasm, something the model cannot detect from text alone.

Script generation using GPT-2

Building upon this exploration of sentiment trends and their inherent limitations, we delve into the application of fine-tuned language models to generate contextually accurate and character-consistent dialogue.

The figure above provides an example of the output from a GPT-2 model fine-tuned specifically on The Office scripts. By emulating the show’s iconic humor and character dynamics, the model generates multi-character dialogues marked with scene transitions like “ — Scene Start — ” and “ — Scene End — .” This format not only mirrors the original scripts but also serves as a foundation for analyzing how well the model captures the tonal and stylistic nuances of the show. Through this generated output, we can assess the interplay between characters, sentiment flow within dialogues, and the potential for the AI to replicate the subtle wit and sarcasm The Office is known for.
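Because the generated scripts carry explicit scene markers, they can be post-processed back into structured data for analysis. The sketch below splits sample output on those markers and collects each scene as a list of (speaker, line) pairs; the sample text is invented, and the markers are written with plain hyphens here, whereas the actual model output uses the dash style quoted above.

```python
# Split scene-marked script output into per-scene speaker turns.
import re

generated = (
    "-- Scene Start --\n"
    "Dwight: Bears eat beets.\n"
    "Jim: Fact: bears, beets, Battlestar Galactica.\n"
    "-- Scene End --\n"
    "-- Scene Start --\n"
    "Pam: Dunder Mifflin, this is Pam.\n"
    "-- Scene End --"
)

scenes = []
for block in re.findall(r"-- Scene Start --\n(.*?)-- Scene End --",
                        generated, re.S):
    scene = []
    for line in block.strip().splitlines():
        # Split on the first ": " so colons inside dialogue survive.
        speaker, _, utterance = line.partition(": ")
        scene.append((speaker, utterance))
    scenes.append(scene)

print(len(scenes), scenes[0][0])
```

Parsed this way, the generated dialogue can be fed straight back through the same VADER pipeline used on the real scripts, enabling a sentiment-flow comparison between original and generated scenes.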

The output reflects the model’s ability to emulate the humor and character dynamics of the show. Dwight’s quirky and abrupt statements, Pam’s playful tone, and Jim’s sarcastic quips mirror the personalities portrayed in The Office. It exemplifies how fine-tuned language models can recreate stylistic nuances from focused datasets. The structured format allows researchers to assess language consistency, humor delivery, and character-specific tendencies, making it a valuable tool for creative exploration in scriptwriting. Minor inconsistencies or repetitions in the generated text highlight the challenges of ensuring contextual accuracy, yet overall, the model effectively demonstrates its capacity to imitate the source material’s style.

Conclusion

Analyzing the sentiment trends of one of television’s most celebrated sitcoms gave us a glimpse into how balanced tonal shifts underpin cohesive, transformative storytelling. Through our in-depth analysis of sentiment in The Office, we drew the following insights:

  • Word choice matters! Specific words in dialogue contribute to high positive or negative sentiment and are often associated with a character’s expression of either enthusiasm or conflict.
  • Subtle differences in character sentiment trends reveal how characters are portrayed over time and reflect different narrative arcs.
  • Fine-tuned language models are effective at imitating stylistic scripts!
  • Sarcasm presents a challenge for sentiment analysis tools and consequently accentuates a need for more context-aware models.

Even so, sentiment analysis is a valuable tool that assists us in comprehending the emotional patterns and character dynamics in television, underscoring the intricate relationship between humor and sentiment that work to construct stimulating stories.
