Python Basics: Variables, Data Types, Loops, and Functions Explained

Python is one of the most popular programming languages in the world — and for good reason. It’s readable, powerful, and used everywhere from data engineering to web development to AI. If you’re just starting out, this guide covers the Python basics you need to write real, working code from day one.

Why Python? A Quick Case for Learning It

Readable syntax: Python code reads almost like plain English
Versatile: used in data engineering, automation, ML, web apps, and scripting
Huge ecosystem: thousands of libraries (pandas, boto3, pyspark, fastapi…)
In-demand: consistently ranks among the most widely used languages in major developer surveys

Setting Up Python

Install Python from python.org (3.10+ recommended). Verify it works:

python3 --version
# Python 3.11.4

For an editor, VS Code with the Python extension is a solid choice.

Variables and Assignment

A variable stores a value. In Python, you don’t declare types — just assign:

name = "Ruby"
age = 30
is_engineer = True
pi = 3.14159

Python figures out the type automatically. You can reassign variables freely:

x = 10
x = "now a string"  # totally valid in Python
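
To see what "figures out the type automatically" means in practice, here is a minimal sketch that checks the type before and after a reassignment. The names `first_type` and `second_type` are only there for illustration.

```python
x = 10
first_type = type(x).__name__   # the type belongs to the value, not the name

x = "now a string"
second_type = type(x).__name__  # same name, new value, new type

print(first_type, second_type)  # int str
```

The name `x` is just a label. The type travels with whatever value the label currently points to.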

Python Data Types

Python has several built-in data types you will use constantly:

Type	Example	Notes
`int`	`42`	Whole numbers
`float`	`3.14`	Floating-point (decimal) numbers
`str`	`"hello"`	Text, in quotes
`bool`	`True`, `False`	Booleans
`list`	`[1, 2, 3]`	Ordered, mutable
`tuple`	`(1, 2, 3)`	Ordered, immutable
`dict`	`{"key": "value"}`	Key-value pairs
`set`	`{1, 2, 3}`	Unordered, unique values

type(42)        # <class 'int'>
type("hello")   # <class 'str'>
type([1,2,3])   # <class 'list'>
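
The "Notes" column above is easier to remember once you see it in action. The sketch below uses made-up values (`skills`, `point`, `unique_tags`) to show a mutable list, an immutable tuple, and a set dropping duplicates.

```python
# Lists are mutable: they can grow and change in place.
skills = ["Python", "SQL"]
skills.append("Spark")

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2, 3)
try:
    point[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Sets keep only unique values, so the duplicate "etl" collapses.
unique_tags = {"etl", "batch", "etl"}

print(skills)            # ['Python', 'SQL', 'Spark']
print(tuple_changed)     # False
print(len(unique_tags))  # 2
```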

Strings: The Basics

name = "Ruby"

# Concatenation
greeting = "Hello, " + name  # "Hello, Ruby"

# f-strings (preferred in modern Python)
greeting = f"Hello, {name}!"  # "Hello, Ruby!"

# Useful string methods
"python".upper()       # "PYTHON"
"  hello  ".strip()    # "hello"
"a,b,c".split(",")     # ["a", "b", "c"]
len("hello")           # 5

Lists: Ordered Collections

tools = ["Python", "SQL", "Spark"]

tools[0]   # "Python"
tools[-1]  # "Spark"

tools.append("dbt")
tools.remove("SQL")

tools[0:2]   # ["Python", "Spark"]

for tool in tools:
    print(tool)

Dictionaries: Key-Value Storage

engineer = {
    "name": "Ruby",
    "skills": ["Python", "AWS", "Spark"],
    "years_experience": 5
}

engineer["name"]          # "Ruby"
engineer.get("location")  # None (no KeyError if the key is missing)

engineer["location"] = "London"

for key, value in engineer.items():
    print(f"{key}: {value}")
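
Here is a short sketch of the difference between square-bracket access and `.get()` when a key is missing. The dictionary is a trimmed-down version of the one above, and the fallback value `"unknown"` is just an example.

```python
engineer = {"name": "Ruby", "years_experience": 5}

# .get() returns a fallback instead of raising an error.
location = engineer.get("location", "unknown")

# "in" checks whether a key exists.
has_name = "name" in engineer

# Square brackets raise KeyError for a missing key.
try:
    engineer["location"]
    missing_key = None
except KeyError as exc:
    missing_key = exc.args[0]

print(location)     # unknown
print(has_name)     # True
print(missing_key)  # location
```

Use square brackets when a missing key is a bug you want to hear about, and `.get()` when a missing key is normal.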

Control Flow: if / elif / else

score = 85

if score >= 90:
    print("Excellent")
elif score >= 70:
    print("Good")
else:
    print("Needs work")

Python uses indentation to define code blocks, not curly braces. Four spaces per level is the convention (PEP 8), and the indentation must be consistent within a block.
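
A minimal sketch of how indentation decides which statements belong to which block. The function name `count_passes` and the pass mark of 70 are assumptions for the example.

```python
def count_passes(scores):
    passes = 0
    for score in scores:
        if score >= 70:
            passes += 1  # inside the if, which is inside the loop
    return passes        # dedented: runs once, after the loop finishes

print(count_passes([85, 60, 92]))  # 2
```

Moving the `return` line one level deeper would put it inside the loop and end the function after the first score, which is a common beginner bug.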

Loops

for loop

numbers = [1, 2, 3, 4, 5]

for n in numbers:
    print(n * 2)

for i in range(5):
    print(i)   # 0 1 2 3 4

while loop

count = 0
while count < 3:
    print(count)
    count += 1

List Comprehensions (Pythonic Shorthand)

squares = [x**2 for x in range(10)]
# [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
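
If the shorthand looks dense, it helps to see the loop it replaces. This sketch builds the same list both ways and then adds an optional `if` filter. The names `squares_loop` and `even_squares` are illustrative.

```python
# The long way: an explicit loop with append.
squares_loop = []
for x in range(10):
    squares_loop.append(x**2)

# The comprehension produces the same list in one line.
squares = [x**2 for x in range(10)]

# A trailing "if" keeps only the items that pass the test.
even_squares = [x**2 for x in range(10) if x % 2 == 0]

print(squares == squares_loop)  # True
print(even_squares)             # [0, 4, 16, 36, 64]
```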

Functions

def greet(name):
    return f"Hello, {name}!"

greet("Ruby")  # "Hello, Ruby!"

Default Arguments

def connect(host, port=5432):
    return f"Connecting to {host}:{port}"

connect("localhost")        # uses port 5432
connect("prod-db", 3306)    # overrides port
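
Two things worth trying with defaults, sketched below: passing arguments by keyword, and using `None` as a placeholder when the default should be a fresh list. The `add_tag` helper is hypothetical and exists only to show the pattern.

```python
def connect(host, port=5432):
    return f"Connecting to {host}:{port}"

default_call = connect("localhost")                # port falls back to 5432
keyword_call = connect(host="prod-db", port=3306)  # arguments passed by name

def add_tag(tag, tags=None):
    # Defaults are evaluated once, when the function is defined, so a
    # mutable default like [] would be shared between calls. Using None
    # and creating the list inside the function avoids that.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags

first = add_tag("etl")
second = add_tag("batch")

print(default_call)   # Connecting to localhost:5432
print(keyword_call)   # Connecting to prod-db:3306
print(first, second)  # ['etl'] ['batch']
```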

*args and **kwargs

def log(*args, **kwargs):
    print(args)
    print(kwargs)

log("error", "timeout", level="WARN", service="api")
# ('error', 'timeout')
# {'level': 'WARN', 'service': 'api'}
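
As a rough sketch of what happens to the arguments: extra positional values land in a tuple, and extra keyword values land in a dict. The same `*` and `**` symbols also work in the other direction, unpacking a list or dict into a call. `describe_call` is a made-up helper for this example.

```python
def describe_call(*args, **kwargs):
    # args is a tuple of positional values; kwargs is a dict of named values.
    return {"positional": args, "keyword": kwargs}

captured = describe_call("error", "timeout", level="WARN", service="api")

# Unpacking: * spreads a list into positional arguments,
# ** spreads a dict into keyword arguments.
messages = ["error", "timeout"]
options = {"level": "WARN", "service": "api"}
unpacked = describe_call(*messages, **options)

print(captured["positional"])  # ('error', 'timeout')
print(captured["keyword"])     # {'level': 'WARN', 'service': 'api'}
print(captured == unpacked)    # True
```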

Error Handling

try:
    result = 10 / 0
except ZeroDivisionError as e:
    print(f"Error: {e}")
finally:
    print("Always runs")
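
To make the order of execution visible, here is a small sketch that wraps the same division in a function and records which blocks ran. `safe_divide` and the `events` list are assumptions for the example, not a standard pattern you must follow.

```python
def safe_divide(a, b):
    events = []
    try:
        result = a / b
        events.append("divided")
    except ZeroDivisionError as e:
        result = None
        events.append(f"error: {e}")
    finally:
        # finally runs whether or not an exception was raised.
        events.append("cleanup")
    return result, events

ok_result, ok_events = safe_divide(10, 2)
bad_result, bad_events = safe_divide(10, 0)

print(ok_result, ok_events)  # 5.0 ['divided', 'cleanup']
print(bad_result)            # None
```

Catching a specific exception such as `ZeroDivisionError`, rather than a bare `except:`, keeps unrelated bugs from being silently swallowed.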

Importing Modules

import os
import json
from datetime import datetime

# Third-party (install with pip)
import pandas as pd
import boto3

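Third-party packages must be installed before you can import them. Run this in your terminal, not inside Python:

pip install pandas boto3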
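
Standard-library modules need no installation, so they are a good place to practise imports. The sketch below uses `json` and `datetime` to write a record out as text and read it back. The record contents and the date are made up for illustration.

```python
import json
from datetime import datetime

# datetime objects are not JSON-serialisable, so store an ISO string.
run_at = datetime(2024, 1, 15, 9, 30).isoformat()
record = {"job": "ingest_s3", "run_at": run_at}

encoded = json.dumps(record)   # dict -> JSON string
decoded = json.loads(encoded)  # JSON string -> dict

print(run_at)             # 2024-01-15T09:30:00
print(decoded == record)  # True
```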

A Mini Project: Putting It All Together

def summarise_pipeline(jobs):
    """Summarise a list of pipeline job results."""
    total = len(jobs)
    succeeded = sum(1 for j in jobs if j["status"] == "success")
    failed = total - succeeded
    # Guard against an empty job list, which would otherwise divide by zero.
    success_rate = (succeeded / total) * 100 if total else 0.0

    return {
        "total": total,
        "succeeded": succeeded,
        "failed": failed,
        "success_rate": f"{success_rate:.1f}%",
    }

jobs = [
    {"name": "ingest_s3", "status": "success"},
    {"name": "transform_spark", "status": "success"},
    {"name": "load_redshift", "status": "failed"},
]

print(summarise_pipeline(jobs))
# {'total': 3, 'succeeded': 2, 'failed': 1, 'success_rate': '66.7%'}
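
One possible next step, sketched here, is listing which jobs failed so you know where to look. The helper name `failed_job_names` is hypothetical. It reuses the same job dictionaries and a list comprehension.

```python
def failed_job_names(jobs):
    # Keep the name of every job whose status is not "success".
    return [j["name"] for j in jobs if j["status"] != "success"]

jobs = [
    {"name": "ingest_s3", "status": "success"},
    {"name": "transform_spark", "status": "success"},
    {"name": "load_redshift", "status": "failed"},
]

print(failed_job_names(jobs))  # ['load_redshift']
```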

FAQ: Python Basics

Is Python good for data engineering?

Yes. Python is one of the primary languages for data engineering, alongside SQL. Widely used tools such as PySpark, pandas, SQLAlchemy, and boto3 are Python libraries.

Do I need to understand data types before writing Python?

Yes, even at a basics level. Knowing the difference between a list and a dict will save you hours of debugging.

What is the difference between a list and a tuple in Python?

A list is mutable (you can change it); a tuple is immutable (fixed after creation). Use tuples for data that should not change — like coordinates or config pairs.
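
A practical consequence, sketched below with made-up coordinates and config values: because tuples cannot change, Python allows them as dictionary keys, while lists are rejected.

```python
# A tuple of coordinates works as a dictionary key.
office_by_coords = {(51.5074, -0.1278): "London"}

# Tuples unpack neatly into separate names, handy for config pairs.
host, port = ("localhost", 5432)

# A list cannot be a key because it is mutable (unhashable).
try:
    bad = {[51.5074, -0.1278]: "London"}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

print(office_by_coords[(51.5074, -0.1278)])  # London
print(host, port)                            # localhost 5432
print(list_key_allowed)                      # False
```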

What is an f-string in Python?

An f-string (formatted string literal) is the modern way to embed variables in strings. Prefix the string with f and wrap variables in {}: f"Hello, {name}".
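
The braces accept more than plain variable names. As a quick sketch with example values, you can put an expression inside them and add a format spec after a colon. f-strings need Python 3.6 or later.

```python
name = "Ruby"
years = 5
rate = 2 / 3

# Any expression can go inside the braces.
message = f"{name} will have {years + 1} years of experience next year"

# A format spec after the colon controls how the value is displayed.
summary = f"Success rate: {rate:.1%}"  # percentage with one decimal place
padded = f"{name:>8}|"                 # right-aligned in 8 characters

print(message)  # Ruby will have 6 years of experience next year
print(summary)  # Success rate: 66.7%
print(padded)   #     Ruby|
```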

How is Python different from other languages?

Python uses indentation for code blocks (not braces), has dynamic typing, and prioritises readability. It typically runs slower than compiled languages such as C or Java, but is usually faster to write and maintain.

What should I learn after Python basics?

Focus on: file I/O, classes and OOP, exception handling patterns, then pick a domain — data engineering (pandas, PySpark), web (FastAPI), or automation (boto3, subprocess).

Wrapping Up

Python basics aren’t just for beginners — they’re the foundation every data engineer, ML engineer, and backend developer returns to. Nail variables, data types, loops, functions, and error handling, and you’ll be writing real, useful code quickly.

The best way to learn? Write code. Break things. Read error messages. Repeat.