Elixir Testing Patterns: Test Data with ExMachina (Part 3)

In Parts 1 and 2, you learned ExUnit fundamentals and database isolation with Ecto.Sandbox. Now you can write tests that run in parallel without conflicts. But there’s still a pain point: creating test data. This is Part 3 of our 7-part series on Elixir testing patterns, where we solve the test data problem with factory patterns.

Hard-coded test data gets brittle fast. You write %{title: "Test Todo", description: "Test description", due_date: ~D[2025-12-31]} in one test, copy it to another, then realize you need to add a new required field. Now you’re updating 50 test files. Factory patterns eliminate this pain.

This post corresponds to PR #4: Factory Patterns with ExMachina & Faker in ex-test. You’ll see exactly how to build flexible, composable test data that adapts as your schema evolves.

Why Factories Beat Fixtures

Let’s start with the problem.

The Problem with Static Fixtures

Remember the module attributes pattern from Part 1? It works well for small test suites:

defmodule ExTest.TodosTest do
  use ExTest.DataCase, async: true
 
  @valid_attrs %{
    title: "Test Todo",
    description: "Test description",
    completed: false,
    due_date: ~D[2025-12-31]
  }
 
  test "creates todo with valid attributes" do
    assert {:ok, todo} = Todos.create_todo(@valid_attrs)
  end
end

This pattern breaks down as your application grows. Here’s why:

Rigidity: Every test uses the exact same data. You can’t easily create variations without duplicating the entire attribute map.

Manual Management: Need a todo with a past due date? Create @overdue_attrs. Need a completed todo? Create @completed_attrs. Soon you have a dozen attribute maps at the top of every test file.

Schema Changes: Add a required field to your schema and suddenly dozens of tests fail because the fixtures are incomplete.

No Randomness: Static data means every test run is identical. You won’t catch edge cases that only appear with certain values.

Database Keys: Tests that insert data can conflict. What if two tests use the same email address and you have a unique constraint?

What Makes Factories Better

Factory patterns solve these problems through functions that build data structures. Instead of static maps, you have functions that generate fresh data on demand.

ExMachina provides the factory pattern for Elixir. Here’s the same test with a factory:

test "creates todo with valid attributes" do
  attrs = params_for(:todo)
  assert {:ok, todo} = Todos.create_todo(attrs)
end

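What changed? Instead of @valid_attrs, we call params_for(:todo). This function returns a map of attributes that the factory builds fresh on every call, rather than one shared map frozen in a module attribute. Once the factory uses Faker and sequences (covered below), each call also produces different values.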

The real power shows when you need variations:

test "lists only completed todos" do
  insert(:todo, completed: true, title: "Done")
  insert(:todo, completed: true, title: "Also done")
  insert(:todo, completed: false, title: "Not done")
 
  completed = Todos.list_completed_todos()
  assert length(completed) == 2
end

The insert/2 function creates a todo with default factory values, then overrides specific fields. Want a completed todo? Pass completed: true. Want a custom title? Pass title: "Custom". The factory handles all the other required fields.

This is composition: start with sensible defaults, then customize only what matters for your test.
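To make "defaults plus overrides" concrete without any library, here is a minimal hand-rolled sketch. The Sketch.Todo struct and Sketch.Factory module are hypothetical names invented for this illustration; ExMachina does considerably more, but the composition idea is the same.

```elixir
defmodule Sketch.Todo do
  defstruct [:title, :description, :due_date, completed: false]
end

defmodule Sketch.Factory do
  alias Sketch.Todo

  # Defaults are recomputed on every call, so the relative due date stays relative.
  def build(:todo, overrides \\ []) do
    defaults = %Todo{
      title: "Sample Todo",
      description: "A sample description",
      due_date: Date.add(Date.utc_today(), 7)
    }

    # struct!/2 applies the overrides on top of the defaults.
    struct!(defaults, overrides)
  end
end

done = Sketch.Factory.build(:todo, completed: true, title: "Done")
IO.inspect(done.title)        # "Done"
IO.inspect(done.completed)    # true
IO.inspect(done.description)  # "A sample description"
```

The test only names the fields it cares about; everything else comes from the defaults.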

Getting Started with ExMachina

ExMachina is the standard factory library for Elixir. It’s inspired by FactoryBot (Ruby) but designed for Elixir’s patterns.

Installation and Setup

Add ExMachina to your dependencies in mix.exs:

defp deps do
  [
    {:ex_machina, "~> 2.7", only: :test},
    {:faker, "~> 0.18", only: :test}
  ]
end

We’re also adding Faker, which generates realistic random data. More on that shortly.

Install dependencies:

mix deps.get

Your First Factory

Create a factory module at test/support/factory.ex:

defmodule ExTest.Factory do
  use ExMachina.Ecto, repo: ExTest.Repo
 
  alias ExTest.Todos.Todo
 
  def todo_factory do
    %Todo{
      title: "Sample Todo",
      description: "A sample description",
      completed: false,
      due_date: Date.add(Date.utc_today(), 7)
    }
  end
end

Breaking this down:

use ExMachina.Ecto: Imports ExMachina’s Ecto adapter, which adds database-aware functions like insert/2.

repo: ExTest.Repo: Tells ExMachina which Repo to use for database operations.

def todo_factory: Factory functions follow the pattern [name]_factory. ExMachina uses the function name to determine which factory to use.

Return value: The factory returns a struct (not a changeset or params map). ExMachina will handle converting it as needed.
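One setup detail worth checking: files under test/support are only compiled if mix.exs includes that directory in the test environment. Phoenix-generated projects usually have this already. If yours does not, a fragment along these lines is a common arrangement (this is an assumption about your mix.exs, not something taken from ex-test):

```elixir
# mix.exs (fragment; keep your existing :app, :version, :deps keys)
def project do
  [
    # ...
    elixirc_paths: elixirc_paths(Mix.env())
  ]
end

# Compile test/support only when running tests.
defp elixirc_paths(:test), do: ["lib", "test/support"]
defp elixirc_paths(_), do: ["lib"]
```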

Now import the factory in your DataCase (we’ve already done this in ex-test):

defmodule ExTest.DataCase do
  use ExUnit.CaseTemplate
 
  using do
    quote do
      alias ExTest.Repo
      import Ecto
      import Ecto.Changeset
      import Ecto.Query
      import ExTest.DataCase
      import ExTest.Factory  # Add this line
    end
  end
 
  # ... rest of DataCase
end

Now every test that uses DataCase has access to factory functions.

The Core API: build, insert, params_for

ExMachina provides three essential functions for working with factories.

build - In-Memory Structs

The build/2 function creates a struct in memory without touching the database:

describe "build vs insert" do
  test "build creates in-memory struct without database" do
    todo = build(:todo)
    assert %Todo{} = todo
    assert todo.id == nil  # Not persisted, no ID
  end
end

Use build/2 when you need a struct for testing validations or functions that don’t require persisted data:

test "changeset validates title presence" do
  todo = build(:todo, title: nil)
  changeset = Todo.changeset(todo, %{})
  refute changeset.valid?
end

Building is fast because it skips the database entirely. For tests that don’t need persistence, build/2 gives you speed.

You can override factory defaults:

test "build with custom attributes" do
  todo = build(:todo, title: "Custom Title", completed: true)
  assert todo.title == "Custom Title"
  assert todo.completed == true
  assert todo.description != nil  # Factory provides this
end

The factory fills in all fields; you override only what matters for your test.

insert - Persisted Records

The insert/2 function creates a struct and saves it to the database:

test "insert persists to database" do
  todo = insert(:todo)
  assert todo.id != nil  # Has an ID from database
 
  # Can retrieve from database
  assert Todos.get_todo!(todo.id) == todo
end

Remember from Part 2: these inserts happen inside a transaction that rolls back when the test completes. The todo only exists for the duration of the test.

Use insert/2 when your test needs persisted data:

test "update_todo modifies existing todo" do
  todo = insert(:todo, title: "Original")
 
  {:ok, updated} = Todos.update_todo(todo, %{title: "Updated"})
 
  assert updated.title == "Updated"
  assert updated.id == todo.id  # Same record
end

Like build/2, you can override defaults:

test "creates overdue todo" do
  past_date = Date.add(Date.utc_today(), -5)
  todo = insert(:todo, due_date: past_date, completed: false)
 
  assert Date.compare(todo.due_date, Date.utc_today()) == :lt
end

params_for - Attribute Maps

The params_for/2 function returns a plain map of attributes, not a struct:

describe "params_for" do
  test "returns a map of attributes (not a struct)" do
    params = params_for(:todo)
    assert is_map(params)
    refute is_struct(params)
  end
end

This is perfect for controller tests or testing context functions that accept attribute maps:

test "useful for controller/context tests" do
  params = params_for(:todo, title: "Params Test")
 
  assert {:ok, todo} = Todos.create_todo(params)
  assert todo.title == "Params Test"
end

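Why not just use build/2 and convert to a map? Because params_for/2 does more than Map.from_struct/1: it returns the schema’s fields without the Ecto metadata, the primary key, or any belongs_to associations, so the result looks like the params a caller would actually send. It’s designed specifically for this use case.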

The pattern:

build/2 → struct for in-memory operations
insert/2 → persisted struct for database operations
params_for/2 → attribute map for create/update operations

Realistic Data with Faker

Static factory data is better than hard-coded fixtures, but it’s still static. Every test run uses the same values. Faker generates random realistic data, making each test run unique.

Common Faker Functions

Here’s a factory using Faker:

def todo_factory do
  %Todo{
    title: Faker.Lorem.sentence(3..5),
    description: Faker.Lorem.paragraph(2..4),
    completed: false,
    due_date: Date.add(Date.utc_today(), Enum.random(1..30))
  }
end

Now every build(:todo) creates a todo with a different title and description. The title is a 3-5 word sentence, the description is a 2-4 sentence paragraph, and the due date is randomly 1-30 days in the future.

Common Faker modules:

Faker.Lorem: Text generation

Faker.Lorem.word()           # "lorem"
Faker.Lorem.sentence(3)      # "Lorem ipsum dolor."
Faker.Lorem.paragraph(2)     # "Lorem ipsum... Dolor sit..."

Faker.Person: Names and identities

Faker.Person.first_name()    # "John"
Faker.Person.last_name()     # "Smith"
Faker.Person.name()          # "John Smith"

Faker.Internet: Email and web data

Faker.Internet.email()       # e.g. "jane.doe@example.net"
Faker.Internet.url()         # "https://example.com/path"

Faker.Address: Location data

Faker.Address.city()         # "San Francisco"
Faker.Address.country()      # "United States"

Generating Dates and Numbers

Dates are tricky in tests. Hard-coded dates like ~D[2025-12-31] break over time. Values computed relative to today, with a little randomness from the standard library, hold up better:

Relative dates:

due_date: Date.add(Date.utc_today(), Enum.random(1..30))

This creates a due date 1-30 days in the future. The test works regardless of when you run it.

Random numbers:

priority: Enum.random(1..5)

Random booleans:

completed: Enum.random([true, false])

Or if you want a weighted random:

# 80% chance of false, 20% chance of true
completed: Enum.random([false, false, false, false, true])

Faker makes tests more robust by exercising different code paths on each run. A test that passes with “Todo 1” might fail with “A very long title that exceeds the maximum length constraint”. Randomness finds edge cases.
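The date and boolean snippets above can be gathered into small helpers so factories stay readable. This is a sketch using only the standard library; the RandomAttrs module and its function names are hypothetical, and weighted_bool/1 generalizes the repeated-list trick to any probability.

```elixir
defmodule RandomAttrs do
  # A date between 1 and 30 days in the future by default.
  def due_in(range \\ 1..30), do: Date.add(Date.utc_today(), Enum.random(range))

  # A date between 1 and 30 days in the past by default.
  def overdue_by(range \\ 1..30), do: Date.add(Date.utc_today(), -Enum.random(range))

  # true with probability p_true; :rand.uniform/0 returns a float in [0.0, 1.0).
  def weighted_bool(p_true) when p_true >= 0 and p_true <= 1 do
    :rand.uniform() < p_true
  end
end

IO.inspect(Date.compare(RandomAttrs.due_in(), Date.utc_today()))      # :gt
IO.inspect(Date.compare(RandomAttrs.overdue_by(), Date.utc_today()))  # :lt
IO.inspect(RandomAttrs.weighted_bool(0.2))  # true roughly 20% of the time
```

In a factory this would read as due_date: RandomAttrs.due_in() and completed: RandomAttrs.weighted_bool(0.2).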

Sequences for Unique Values

Faker generates random data, but what about fields that must be unique? Email addresses, usernames, and other unique constraints need guaranteed uniqueness.

When You Need Uniqueness

Imagine a user factory:

def user_factory do
  %User{
    email: Faker.Internet.email(),
    username: Faker.Internet.user_name()
  }
end

This works until you create two users in the same test:

test "creates multiple users" do
  user1 = insert(:user)
  user2 = insert(:user)  # Might fail! Random email could collide
end

Faker’s randomness isn’t guaranteed to be unique. For a 100-test suite, you might see occasional failures when random values collide.

Named Sequences

Sequences solve this. They generate incrementing values guaranteed to be unique:

def todo_factory do
  %Todo{
    title: sequence(:title, &"Todo ##{&1}: #{Faker.Lorem.sentence(3)}"),
    description: Faker.Lorem.paragraph(2..4),
    completed: false,
    due_date: Date.add(Date.utc_today(), Enum.random(1..30))
  }
end

The sequence/2 function takes a name and a formatter function. The formatter receives a counter that starts at 0 and increments on each call:

build(:todo)  # title: "Todo #0: Lorem ipsum dolor."
build(:todo)  # title: "Todo #1: Sit amet consectetur."
build(:todo)  # title: "Todo #2: Adipiscing elit sed."

Each title is unique. The sequence counter increments globally across your test suite.

For simpler cases, the formatter can just interpolate the counter into a fixed string:

email: sequence(:email, &"user-#{&1}@example.com")

This generates:

user-0@example.com
user-1@example.com
user-2@example.com

Multiple sequences are independent:

def user_factory do
  %User{
    email: sequence(:email, &"user-#{&1}@example.com"),
    username: sequence(:username, &"user#{&1}")
  }
end

The :email sequence and :username sequence maintain separate counters.

Sequences are essential for fields with unique constraints. Use Faker for variety, sequences for uniqueness.
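To see why named sequences stay independent and never repeat, it can help to picture a map of counters held in one process. The following is only a conceptual sketch built on Agent; SequenceSketch is a hypothetical module, and this is not ExMachina's actual implementation.

```elixir
defmodule SequenceSketch do
  use Agent

  # One process holds a map of name => next counter value.
  def start_link(_opts \\ []) do
    Agent.start_link(fn -> %{} end, name: __MODULE__)
  end

  # Reads the current counter for `name`, bumps it, and formats the old value.
  def next(name, formatter) when is_function(formatter, 1) do
    n =
      Agent.get_and_update(__MODULE__, fn counters ->
        current = Map.get(counters, name, 0)
        {current, Map.put(counters, name, current + 1)}
      end)

    formatter.(n)
  end
end

{:ok, _pid} = SequenceSketch.start_link()

IO.inspect(SequenceSketch.next(:email, &"user-#{&1}@example.com"))  # "user-0@example.com"
IO.inspect(SequenceSketch.next(:email, &"user-#{&1}@example.com"))  # "user-1@example.com"
IO.inspect(SequenceSketch.next(:username, &"user#{&1}"))            # "user0"
```

Because the read and the increment happen in a single Agent call, two callers can never receive the same number for the same name.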

Factory Variants with struct!

Often you need variations of the same factory. A completed todo, an overdue todo, a minimal todo. You could create separate attribute maps, but there’s a better pattern.

Creating Trait Factories

ExMachina doesn’t have built-in “traits” like FactoryBot, but Elixir’s struct!/2 function provides the same capability:

def completed_todo_factory do
  struct!(todo_factory(), %{completed: true})
end
 
def overdue_todo_factory do
  struct!(todo_factory(), %{
    due_date: Date.add(Date.utc_today(), -Enum.random(1..30)),
    completed: false
  })
end
 
def minimal_todo_factory do
  %Todo{title: Faker.Lorem.sentence(2..3)}
end

Now you have multiple factory variants:

describe "factory variants (traits)" do
  test "completed_todo_factory creates completed todos" do
    todo = insert(:completed_todo)
    assert todo.completed == true
    assert todo.title != nil  # Still gets default title from todo_factory
  end
 
  test "overdue_todo_factory creates past-due todos" do
    todo = insert(:overdue_todo)
    assert Date.compare(todo.due_date, Date.utc_today()) == :lt
    assert todo.completed == false
  end
 
  test "minimal_todo_factory creates minimal valid todos" do
    todo = insert(:minimal_todo)
    assert todo.title != nil
    assert todo.description == nil  # Only has required fields
  end
end

The struct! Pattern

Why struct!/2 instead of Map.merge/2? Both return a %Todo{} when you start from one, but only struct!/2 validates the keys. Map.merge/2 accepts any key, so a misspelled field is silently added to the struct instead of raising:

# RISKY - Map.merge does not check keys against the struct definition
def completed_todo_factory do
  Map.merge(todo_factory(), %{completed: true})
end
 
# RIGHT - struct! raises KeyError if a key is not a Todo field
def completed_todo_factory do
  struct!(todo_factory(), %{completed: true})
end

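The struct!/2 pattern also catches typos early. The check happens at runtime rather than compile time, so the first test that calls the factory fails with a KeyError:

# Raises KeyError when the factory is called, because :completd is not a valid field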
def completed_todo_factory do
  struct!(todo_factory(), %{completd: true})  # Typo!
end

Pattern for factory variants:

Call the base factory function
Use struct!/2 to override specific fields
Name the variant factory with _factory suffix

This keeps all your todo logic in one place. Change todo_factory and all variants inherit the changes.
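The difference between the two approaches is easy to check in isolation. This sketch uses a throwaway StructDemo.Todo struct (a hypothetical stand-in for the real schema) and feeds the same misspelled key to both functions.

```elixir
defmodule StructDemo.Todo do
  defstruct title: "Sample Todo", completed: false
end

defmodule StructDemo do
  alias StructDemo.Todo

  # Map.merge/2 performs no key validation.
  def unchecked(overrides), do: Map.merge(%Todo{}, overrides)

  # struct!/2 raises KeyError for keys the struct does not define.
  def checked(overrides) do
    {:ok, struct!(%Todo{}, overrides)}
  rescue
    error in KeyError -> {:error, error.key}
  end
end

merged = StructDemo.unchecked(%{completd: true})
IO.inspect(is_struct(merged, StructDemo.Todo))  # true
IO.inspect(Map.has_key?(merged, :completd))     # true, the typo was stored
IO.inspect(merged.completed)                    # false, the real field is untouched

IO.inspect(StructDemo.checked(%{completd: true}))  # {:error, :completd}
```

With Map.merge/2 the test would keep running against a todo that was never marked completed, which is a much harder failure to trace.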

Batch Creation

Tests often need multiple records. Instead of calling insert/2 repeatedly, use batch creation functions.

build_list and insert_list

ExMachina provides build_list/3 and insert_list/3:

describe "batch creation" do
  test "build_list creates multiple structs" do
    todos = build_list(5, :todo)
    assert length(todos) == 5
    assert Enum.all?(todos, &(is_struct(&1, Todo)))
  end
 
  test "insert_list creates multiple database records" do
    todos = insert_list(3, :todo, completed: true)
    assert length(todos) == 3
    assert Enum.all?(todos, &(&1.completed == true))
    assert Enum.all?(todos, &(&1.id != nil))
  end
end

The third argument lets you override attributes for all created records:

test "creates 10 completed todos" do
  todos = insert_list(10, :todo, completed: true)
  assert length(todos) == 10
  assert Enum.all?(todos, &(&1.completed == true))
end

Combined with sequences, this creates unique records in bulk:

test "creates todos with sequential titles" do
  todos = insert_list(3, :todo)
 
  # Each has a unique title from the sequence
  assert todos |> Enum.map(& &1.title) |> Enum.uniq() |> length() == 3
end
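Conceptually, a list builder just invokes the factory once per record, which is why every record gets its own fresh values. The sketch below shows that idea in plain Elixir; BatchSketch is hypothetical, uses maps instead of Ecto structs, and stands in System.unique_integer/1 for a sequence.

```elixir
defmodule BatchSketch do
  # A stand-in factory: each call produces a new unique title.
  def build(:todo, attrs \\ []) do
    defaults = %{
      title: "Todo #{System.unique_integer([:positive])}",
      completed: false
    }

    Map.merge(defaults, Map.new(attrs))
  end

  # Calls the factory `count` times with the same overrides.
  def build_list(count, name, attrs \\ []) do
    Stream.repeatedly(fn -> build(name, attrs) end) |> Enum.take(count)
  end
end

todos = BatchSketch.build_list(3, :todo, completed: true)
IO.inspect(length(todos))                                          # 3
IO.inspect(Enum.all?(todos, & &1.completed))                       # true
IO.inspect(todos |> Enum.map(& &1.title) |> Enum.uniq() |> length())  # 3
```

The overrides are shared by every record, while anything the factory computes per call differs from one record to the next.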

Real-World Use Cases

Batch creation shines in tests that need realistic data volumes:

test "pagination returns correct page" do
  # Create 25 todos
  insert_list(25, :todo)
 
  # Test first page
  page1 = Todos.list_todos(page: 1, per_page: 10)
  assert length(page1) == 10
 
  # Test third page
  page3 = Todos.list_todos(page: 3, per_page: 10)
  assert length(page3) == 5
end

Or testing filters:

test "filters by completion status" do
  # Create mix of completed and incomplete
  insert_list(7, :todo, completed: true)
  insert_list(3, :todo, completed: false)
 
  completed = Todos.list_completed_todos()
  incomplete = Todos.list_incomplete_todos()
 
  assert length(completed) == 7
  assert length(incomplete) == 3
end

Without batch creation, this would be 10 separate insert/2 calls: verbose and easy to miscount.

Composed Factory Helpers

Sometimes you need more complex setup than a simple factory can provide. Factory helpers compose multiple factories into reusable setup functions.

When to Create Helpers

Consider this test setup:

test "shows statistics for mixed todo list" do
  # 5 completed todos
  insert(:todo, completed: true)
  insert(:todo, completed: true)
  insert(:todo, completed: true)
  insert(:todo, completed: true)
  insert(:todo, completed: true)
 
  # 3 incomplete todos
  insert(:todo, completed: false)
  insert(:todo, completed: false)
  insert(:todo, completed: false)
 
  # 2 overdue todos
  insert(:overdue_todo)
  insert(:overdue_todo)
 
  stats = Todos.get_statistics()
  assert stats.completed_count == 5
  assert stats.incomplete_count == 3
  assert stats.overdue_count == 2
end

This setup is verbose and error-prone. A factory helper cleans it up.

Complex Scenario Setup

Create a factory helper module at test/support/factory_helper.ex:

defmodule ExTest.FactoryHelper do
  alias ExTest.Factory
 
  def factory_completed_todo(attrs \\ %{}) do
    Factory.insert(:todo, Map.merge(%{completed: true}, attrs))
  end
 
  def factory_overdue_todo(attrs \\ %{}) do
    Factory.insert(:todo, Map.merge(%{
      due_date: Date.add(Date.utc_today(), -Enum.random(1..30)),
      completed: false
    }, attrs))
  end
 
  def factory_todos(count, attrs \\ %{}) do
    Factory.insert_list(count, :todo, attrs)
  end
 
  def factory_mixed_todos(opts \\ []) do
    completed_count = Keyword.get(opts, :completed, 3)
    incomplete_count = Keyword.get(opts, :incomplete, 3)
 
    completed = factory_todos(completed_count, %{completed: true})
    incomplete = factory_todos(incomplete_count, %{completed: false})
 
    completed ++ incomplete
  end
end

Import this in your DataCase:

using do
  quote do
    alias ExTest.Repo
    import Ecto
    import Ecto.Changeset
    import Ecto.Query
    import ExTest.DataCase
    import ExTest.Factory
    import ExTest.FactoryHelper  # Add this
  end
end

Now the test becomes:

test "shows statistics for mixed todo list" do
  factory_mixed_todos(completed: 5, incomplete: 3)
  insert_list(2, :overdue_todo)
 
  stats = Todos.get_statistics()
  assert stats.completed_count == 5
  assert stats.incomplete_count == 3
  assert stats.overdue_count == 2
end

Much cleaner. The test focuses on what it’s testing (statistics calculation), not data setup.

Another helper example:

def factory_todo_with_reminders(todo_attrs \\ %{}, reminder_count \\ 3) do
  todo = Factory.insert(:todo, todo_attrs)
 
  reminders =
    Factory.insert_list(reminder_count, :reminder, %{
      todo_id: todo.id,
      scheduled_at: DateTime.add(DateTime.utc_now(), 3600, :second)
    })
 
  %{todo: todo, reminders: reminders}
end

Use it like:

test "sends all reminders for a todo" do
  %{todo: todo} = factory_todo_with_reminders()
 
  Todos.send_reminders(todo)
 
  # Verify all reminders were sent
  assert length(get_sent_emails()) == 3
end

Factory helpers encapsulate complex object graphs, making tests readable and maintainable.

Best Practices

Factory patterns are powerful but can be misused. Here’s how to use them effectively.

Factory Organization

One factory module: Keep all factories in test/support/factory.ex. Don’t split into multiple files unless you have dozens of schemas.

Factory per schema: Every Ecto schema should have a corresponding factory. This creates consistency.

Sensible defaults: Factory defaults should create valid records without overrides. Tests override only what’s relevant.

Use Faker wisely: Randomness is good for finding edge cases, but debugging random failures is painful. Use sequences for fields with constraints.

Variants vs overrides: Create factory variants (:completed_todo) for common patterns, use overrides (completed: true) for one-off tests.

Common Pitfalls

Over-building: Don’t create factories for every possible state. You’ll have 20 user factory variants nobody uses. Create variants as you need them.

Hidden dependencies: Factory helpers that create complex object graphs hide dependencies. Use them for common scenarios, not every test.

# GOOD - explicit about what's being created
test "updates todo" do
  todo = insert(:todo, title: "Original")
  {:ok, updated} = Todos.update_todo(todo, %{title: "Updated"})
  assert updated.title == "Updated"
end
 
# BAD - hides what setup_todo_with_user_and_project actually creates
test "updates todo" do
  %{todo: todo} = setup_todo_with_user_and_project()
  {:ok, updated} = Todos.update_todo(todo, %{title: "Updated"})
  assert updated.title == "Updated"
end

build vs insert confusion: Use build/2 for speed when possible, insert/2 when you need persistence. Don’t default to insert/2 everywhere.

Ignoring sequences: If you’re getting intermittent unique constraint violations, you need sequences. Don’t rely on Faker’s randomness for unique fields.

Overly DRY: Don’t extract every 2-line setup into a helper. Sometimes explicit is better than DRY.

Summary

You’ve mastered factory patterns for Elixir test data:

Why factories: Flexibility, composition, and resilience to schema changes
ExMachina API: build/2 for in-memory structs, insert/2 for persistence, params_for/2 for attribute maps
Faker integration: Realistic random data that catches edge cases
Sequences: Guaranteed unique values for constrained fields
Factory variants: Using struct!/2 to create specialized factories
Batch creation: build_list/3 and insert_list/3 for multiple records
Factory helpers: Composed functions for complex scenario setup

ex-test demonstrates these patterns across all tests. No more hard-coded attribute maps. No more brittle fixtures. Every test uses flexible, expressive factories.

In Part 4, we’ll tackle a different problem: testing code that depends on external services. Learn to use Mox for behavior-driven mocking with @callback contracts.

All code examples are available in the ex-test repository.

Previous: Part 2 - Database Isolation with Ecto.Sandbox

Next: Part 4 - Mocking with Mox

All Parts

ExUnit Fundamentals
Database Isolation with Ecto.Sandbox
Test Data with ExMachina (You are here)
Mocking with Mox
Adapter Pattern and Stubs
Centralized Test Helpers
Phoenix Controller Testing
