OIM3640 - Problem Solving and Software Design

2026 Spring

Session 16 (3/24)


Today's Agenda

  • Announcements/Updates
  • What We've Learned So Far
  • Review Questions
  • Lecture: Tuples, Sets, and Choosing Data Structures
  • Practice: Text Analysis

Announcements/Updates

  • Welcome back from Spring Break!
  • Mini Project 2: Text Analysis - start your proposals!
  • Mini Project - keep working on elective projects
  • Communication
    • Office Hours: Walk-in or by appointment
    • Email: Specify course # in subject, e.g., "OIM3640: GitHub settings"
    • You are required to meet with me at least once this semester
  • Questions?

What We've Learned So Far

  • Python Fundamentals
    • Variables & types, functions, conditionals
  • Iteration: for, while, break, continue
  • Strings: indexing, slicing, immutability, methods
  • Lists: mutable sequences, aliasing, sorting
  • Dictionaries: key-value mappings, counting pattern

🔴 Live Demo: Real Stock Data

import yfinance as yf

stock = yf.Ticker('AAPL')
info = stock.info            # a dict!
print(info['shortName'])     # 'Apple Inc.'
print(info['currentPrice'])  # 229.87

We'll learn about APIs and libraries later in the course!

🙋 Review Questions

info = yf.Ticker('AAPL').info
# 'shortName', 'city', 'longBusinessSummary',
# 'sector', 'fullTimeEmployees', ...
  • What does info['longBusinessSummary'].split() give you?
  • What does 'iPhone' in info['longBusinessSummary'] return?
  • Can you do info['city'][0] = 'c'? Why not?
  • What does len(info) tell you?
  • Explore: what other keys does info have?

🙋 Review Questions (continued)

tickers = ['AAPL', 'NVDA', 'MSFT']
prices = {}
for t in tickers:
    prices[t] = yf.Ticker(t).info['currentPrice']
  • What does sorted(tickers) return? Does it change tickers?
  • How do you get the total value of all 3 stocks?
  • What does 'TSLA' in prices return?
  • How would you add 'GOOG' to both tickers and prices?
  • Explore: pull other fields and build a richer dict!

Chapter 11 - Tuples

What we'll learn:

  • Tuples as immutable sequences
  • Tuple assignment and unpacking
  • Returning multiple values
  • zip() for combining sequences

Tuples: Immutable Sequences

t = ('a', 'b', 'c', 'd')
t[0]       # 'a'
t[1:3]     # ('b', 'c')

You cannot modify them:

t[0] = 'X'   # TypeError! Tuples are immutable.
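
Because a tuple cannot be changed in place, one common workaround is to build a new tuple from pieces of the old one. A minimal sketch reusing t from above; the name new_t is only for illustration:

```python
t = ('a', 'b', 'c', 'd')

try:
    t[0] = 'X'            # item assignment is not supported on tuples
except TypeError as err:
    print('TypeError:', err)

# Build a new tuple instead of modifying the old one
new_t = ('X',) + t[1:]
print(new_t)              # ('X', 'b', 'c', 'd')
print(t)                  # ('a', 'b', 'c', 'd')  original is unchanged
```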

Tuple Assignment & Unpacking

a, b = 1, 2
a, b = b, a           # swap without a temp! now a = 2, b = 1

point = (3, 4)
x, y = point           # x = 3, y = 4

# Works great with dict.items()
for key, value in prices.items():
    print(f'{key}: ${value}')

Returning Multiple Values

Tuples let functions return more than one value:

def min_max(numbers):
    return min(numbers), max(numbers)

lowest, highest = min_max([3, 1, 4, 1, 5])
# lowest = 1, highest = 5

The return creates a tuple; unpacking assigns both at once.
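
To see the tuple itself, you can capture the return value in a single variable before unpacking. A small sketch using the same min_max function; the variable name result is just for illustration:

```python
def min_max(numbers):
    return min(numbers), max(numbers)

result = min_max([3, 1, 4, 1, 5])
print(result)          # (1, 5)
print(type(result))    # <class 'tuple'>

# Unpacking the tuple afterwards gives the same two values
lowest, highest = result
print(lowest, highest) # 1 5
```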

Tuples as Dict Keys

You want to track prices by ticker and date:

# Try a list key?
prices = {['AAPL', '2026-03-24']: 229}  # TypeError! Lists are mutable, so not hashable.
# String key? Works, but messy...
prices = {'AAPL(2026-03-24)': 229}
# Tuple key ✓ clean and natural!
prices = {('AAPL', '2026-03-24'): 229}
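
Once the keys are tuples, lookups and loops work the same as with any other dict. A short sketch; every price except the first is a made-up value for illustration:

```python
prices = {
    ('AAPL', '2026-03-24'): 229,   # value from the slide
    ('AAPL', '2026-03-25'): 231,   # made-up value
    ('MSFT', '2026-03-24'): 415,   # made-up value
}

# Look up one price with a (ticker, date) tuple
print(prices[('AAPL', '2026-03-24')])   # 229

# Unpack the tuple key while looping
aapl_dates = []
for (ticker, date), price in prices.items():
    if ticker == 'AAPL':
        aapl_dates.append(date)
print(aapl_dates)   # ['2026-03-24', '2026-03-25']
```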

zip(): Combine Two Sequences

names = ['AAPL', 'GOOG', 'MSFT']
prices = [182.30, 141.80, 415.20]

for name, price in zip(names, prices):
    print(f'{name}: ${price}')

# Create a dict from two lists
stock_prices = dict(zip(names, prices))
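
Two details worth knowing: zip() produces pairs lazily, so wrap it in list() to see them, and it stops at the end of the shorter sequence. A quick sketch with the same lists; the names pairs and short are illustrative:

```python
names = ['AAPL', 'GOOG', 'MSFT']
prices = [182.30, 141.80, 415.20]

# Each pair is a tuple
pairs = list(zip(names, prices))
print(pairs)   # [('AAPL', 182.3), ('GOOG', 141.8), ('MSFT', 415.2)]

# With unequal lengths, zip stops at the shorter sequence
short = list(zip(names, [182.30, 141.80]))
print(len(short))   # 2
```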

Sets: Unique Collections

A set holds unique items only, unordered:

watchlist = {'AAPL', 'NVDA', 'AAPL', 'MSFT'}
print(watchlist)   # {'AAPL', 'NVDA', 'MSFT'}  (order may vary)

tickers = ['AAPL', 'NVDA', 'AAPL', 'MSFT', 'NVDA']
unique = set(tickers)    # {'AAPL', 'NVDA', 'MSFT'}  (order may vary)
len(set(tickers))        # 3 unique stocks

Set Operations

my_stocks = {'AAPL', 'NVDA', 'MSFT', 'GOOG'}
your_stocks = {'MSFT', 'GOOG', 'TSLA', 'AMZN'}

my_stocks & your_stocks  # {'MSFT', 'GOOG'}  both own
my_stocks | your_stocks  # all 6 stocks combined
my_stocks - your_stocks  # {'AAPL', 'NVDA'}  only I own

Membership: 'TSLA' in my_stocks # False
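
The operators above also have method spellings, which some people find easier to read. A small sketch with the same two sets; sorted() is used only so the printed order is predictable:

```python
my_stocks = {'AAPL', 'NVDA', 'MSFT', 'GOOG'}
your_stocks = {'MSFT', 'GOOG', 'TSLA', 'AMZN'}

both = my_stocks.intersection(your_stocks)     # same as &
combined = my_stocks.union(your_stocks)        # same as |
only_mine = my_stocks.difference(your_stocks)  # same as -

print(sorted(both))        # ['GOOG', 'MSFT']
print(len(combined))       # 6
print(sorted(only_mine))   # ['AAPL', 'NVDA']
```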

Why Sets? Speed!

import timeit
with open('data/words.txt') as f:   # closes the file when done
    words = f.read().split()
word_set = set(words)     # 113K+ words

def search_list():
    return 'python' in words
def search_set():
    return 'python' in word_set

print('List:', timeit.timeit(search_list, number=1000))
print('Set: ', timeit.timeit(search_set, number=1000))
# List: 0.8500s  Set: 0.0003s  (example timings; yours will vary)

Curious why sets are faster? Ask AI!
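
The short version: a list checks items one by one, while a set uses each item's hash to jump almost directly to where it would be stored. If you don't have data/words.txt handy, here is a sketch of the same experiment with stand-in data. The generated "words" and the name target are assumptions for illustration, and the exact timings depend on your machine:

```python
import timeit

# Stand-in data: 100,000 made-up "words" instead of data/words.txt
words = [f'word{i}' for i in range(100_000)]
word_set = set(words)
target = 'word99999'   # last item, so the list has to scan everything

list_time = timeit.timeit(lambda: target in words, number=100)
set_time = timeit.timeit(lambda: target in word_set, number=100)

print(f'List: {list_time:.4f}s')
print(f'Set:  {set_time:.4f}s')
```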

Choosing Data Structures

Type    Mutable?   Ordered?   Access by    Duplicates?
str     No         Yes        index        Yes
list    Yes        Yes        index        Yes
tuple   No         Yes        index        Yes
dict    Yes        Yes        key          Keys: No
set     Yes        No         membership   No

When to Use What?

  • list: ordered collection, may have duplicates
    • Student grades, stock price history
  • tuple: fixed data that shouldn't change
    • GPS coordinates, returning multiple values
  • dict: look up values by a meaningful key
    • Stock prices by ticker, word frequency counts
  • set: unique items, fast membership testing
    • Unique words in a document, valid usernames
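
The four choices above, side by side in code. This is only a sketch; all values and names are made up for illustration:

```python
grades = [88, 92, 88]                 # list: order matters, duplicates allowed
coords = (42.3, -71.3)                # tuple: a fixed pair (made-up coordinates)
word_counts = {'the': 3, 'cat': 1}    # dict: look up a count by word
usernames = {'ana', 'ben'}            # set: unique items, fast membership test

grades.append(75)                     # lists can grow
print(grades)                 # [88, 92, 88, 75]
print(coords[0])              # 42.3
print(word_counts['the'])     # 3
print('ana' in usernames)     # True
```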

🙋 Which Data Structure?

You're building a music app:

  • Your favorites playlist (add, remove, reorder)
  • All artists a user has ever listened to (no repeats)
  • Song title → number of times played
  • A single song record: ('Bohemian Rhapsody', 'Queen', 1975)

Ch 12: Text Analysis

Let's practice! All data structures come together:

Step                Data Structure
Store all words     list
Count frequencies   dict
Find unique words   set
Sort by frequency   list of tuples

Reading Words from a File

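import string

words = []
for line in open('jekyll.txt'):
    for word in line.split():
        word = word.strip(string.punctuation).lower()  # drop surrounding punctuation
        if word:   # skip tokens that were only punctuation
            words.append(word)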

print(len(words))         # total words
print(len(set(words)))    # unique words (set!)

Counting and Sorting Words

freq = {}
for word in words:
    freq[word] = freq.get(word, 0) + 1

def second_element(t):
    return t[1]

top = sorted(freq.items(), key=second_element, reverse=True)
for word, count in top[:5]:
    print(f'{word}: {count}')
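
Two possible shortcuts once the pure-Python version makes sense: a lambda can replace the small second_element helper, and the standard library's collections.Counter does the counting and sorting for you. A sketch on a tiny made-up word list (not the book text):

```python
from collections import Counter

words = ['the', 'cat', 'the', 'hat', 'the', 'cat']   # made-up sample

# Pure Python, with a lambda instead of a named helper function
freq = {}
for word in words:
    freq[word] = freq.get(word, 0) + 1
top = sorted(freq.items(), key=lambda t: t[1], reverse=True)
print(top[:2])                  # [('the', 3), ('cat', 2)]

# Same result with Counter from the standard library
counts = Counter(words)
print(counts.most_common(2))    # [('the', 3), ('cat', 2)]
```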

📝 Quiz 2 on Thursday

Topics: lists, tuples, dicts, sets, mutability, counting pattern

You may bring one cheat sheet (one page, both sides).

Python 3 Cheat Sheet for reference.

📝 Your Turn: Text Analysis

Try the text analysis code on a book from Project Gutenberg:

  • Count total words and unique words
  • Find the 10 most common words
  • Find words that appear exactly once

MP2 is about text analysis - take any text, find something interesting.
Start with pure Python, then explore libraries. Write your PROPOSAL.md!

📝 Your Turn: Ch 9 Lists

  • Exercise 9.2: is_anagram - check if two words are anagrams
  • Exercise 9.3: is_palindrome - find palindromes (7+ letters)
  • Exercise 9.4: reverse_sentence - reverse word order in a string
  • Exercise 9.5: total_length - sum lengths of all strings in a list

📝 Your Turn: Ch 10 Dicts

  • Exercise 10.2: Rewrite value_counts using get() (no if!)
  • Exercise 10.3: has_duplicates - check for repeated elements
  • Exercise 10.4: find_repeats - return keys with count > 1
  • Exercise 10.5: add_counters - combine two frequency dicts
  • Exercise 10.6: is_interlocking - split word into two valid words

📝 Your Turn: Ch 11 Tuples

  • Exercise 11.3: Caesar cipher (shift_word) - same idea as Puzzle 1!
  • Exercise 11.4: Sort letters by frequency (most_frequent_letters)
  • Exercise 11.5: Find all anagram groups from a word list
  • Exercise 11.6: word_distance - count differing positions (use zip)
  • Exercise 11.7: Find metathesis pairs (single letter swaps)

Questions? Ask now!

Before You Leave

  • Any questions?
  • Start your Learning Log for this week (logs/wk09.md)
  • Start your MP2 proposal (PROPOSAL.md)
  • Study for Quiz 2 (data structures) on Thursday
  • Push your work to GitHub

Next session: Quiz 2 + File I/O
