Ruby and machine learning. Not the first pairing that comes to mind. Python dominates the ML space for good reason. But sometimes you need to add ML capabilities to an existing Ruby app, or you just want to prototype something fast without context-switching languages.
That's where this guide comes in.
Why Ruby for ML?
Look, nobody's saying you should build the next GPT in Ruby. But Ruby has some real advantages for certain ML use cases:
- Rapid prototyping. Ruby's syntax is clean. You can get a working model running in minutes.
- Existing infrastructure. Your Rails app already exists. Adding an external Python service means more moving parts.
- Readable code. When non-ML developers need to understand what's happening, Ruby wins.
The tradeoff? Fewer libraries, smaller community, slower execution. Know when it makes sense.
Getting Started with ruby-fann
The ruby-fann gem wraps the Fast Artificial Neural Network (FANN) library. It's not TensorFlow, but it handles basic neural network tasks without the Python dependency.
Install it:
gem install ruby-fann
If you hit compilation issues on macOS, you might need to install the FANN library first:
brew install fann
Your First Neural Network: XOR Classification
The classic starting point. XOR is a non-linear problem that can't be solved with a simple perceptron. Perfect for proving your network actually works.
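To see why a single perceptron isn't enough, here's a minimal sketch in plain Ruby, no gems required. The helper names (predict, train_perceptron, count_correct) are made up for this illustration and are not part of ruby-fann. The same learner handles AND, which is linearly separable, but never gets all four XOR rows right.

```ruby
# Truth tables as [inputs, target] pairs
and_rows = [[[0, 0], 0], [[0, 1], 0], [[1, 0], 0], [[1, 1], 1]]
xor_rows = [[[0, 0], 0], [[0, 1], 1], [[1, 0], 1], [[1, 1], 0]]

# A single neuron with a hard threshold: fires when the weighted sum is positive
def predict(weights, bias, inputs)
  sum = bias + weights.zip(inputs).sum { |w, x| w * x }
  sum > 0 ? 1 : 0
end

# Classic perceptron learning rule with a learning rate of 1
def train_perceptron(rows, epochs: 100)
  weights = [0, 0]
  bias = 0
  epochs.times do
    rows.each do |inputs, target|
      error = target - predict(weights, bias, inputs)
      weights = weights.zip(inputs).map { |w, x| w + error * x }
      bias += error
    end
  end
  [weights, bias]
end

# Train on the rows, then count how many of them the neuron classifies correctly
def count_correct(rows)
  weights, bias = train_perceptron(rows)
  rows.count { |inputs, target| predict(weights, bias, inputs) == target }
end

and_correct = count_correct(and_rows)
xor_correct = count_correct(xor_rows)
puts "AND: #{and_correct}/4 correct"
puts "XOR: #{xor_correct}/4 correct"
```

No choice of weights fixes the XOR case, because no single straight line separates its two classes. A hidden layer removes that limitation, which is what the network in this section relies on.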
require 'ruby-fann'
# XOR truth table
# Inputs -> Expected Output
# [0, 0] -> 0
# [0, 1] -> 1
# [1, 0] -> 1
# [1, 1] -> 0
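# train_on_data expects a RubyFann::TrainData object, not a plain array.
# Inputs and desired outputs are parallel arrays in the same row order.
train_data = RubyFann::TrainData.new(
  inputs: [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]],
  desired_outputs: [[0.0], [1.0], [1.0], [0.0]]
)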
# Network architecture:
# - 2 input neurons (our two binary inputs)
# - 3 neurons in hidden layer (enough to learn XOR)
# - 1 output neuron (our classification result)
fann = RubyFann::Standard.new(
  num_inputs: 2,
  hidden_neurons: [3],
  num_outputs: 1
)
# Train: 1000 epochs max, report every 10, target MSE of 0.1
fann.train_on_data(train_data, 1000, 10, 0.1)
# Test it
puts fann.run([0.0, 0.0]).inspect # Should be close to 0
puts fann.run([0.0, 1.0]).inspect # Should be close to 1
puts fann.run([1.0, 0.0]).inspect # Should be close to 1
puts fann.run([1.0, 1.0]).inspect # Should be close to 0
The output won't be exactly 0 or 1, because the network produces continuous values. You'll see something like [0.03] or [0.97], or values further from the extremes if training stops right at the loose 0.1 MSE target. Threshold at 0.5 for binary classification.
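Thresholding is a one-liner. Here's a small sketch in plain Ruby; to_binary is a hypothetical helper name, and the sample values are stand-ins shaped like what fann.run returns, not real output from a trained network.

```ruby
# Hypothetical helper: turn a continuous network output into a 0/1 label
def to_binary(output, threshold = 0.5)
  output > threshold ? 1 : 0
end

# Stand-in values, each shaped like the single-element array fann.run returns
sample_outputs = [[0.03], [0.97], [0.91], [0.08]]

labels = sample_outputs.map { |out| to_binary(out.first) }
puts labels.inspect
```

Keeping the threshold as a parameter makes it easy to trade false positives for false negatives later without touching the network.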
Understanding the Parameters
The train_on_data method takes four arguments:
- Training data - A RubyFann::TrainData object built from parallel inputs and desired_outputs arrays
- Max epochs - The maximum number of passes through the training data
- Epochs between reports - Console output frequency (set to 0 to silence)
- Desired MSE - Mean Squared Error target. Training stops when reached.
Lower MSE means better fit. But watch out for overfitting on small datasets.
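For reference, MSE is just the average of the squared differences between what the network predicted and what you wanted. A minimal sketch in plain Ruby follows; mean_squared_error is a hypothetical helper for illustration, and the predictions are made-up numbers. FANN tracks its own error internally, so treat this as the idea rather than the library's exact bookkeeping.

```ruby
# Average squared difference across every output of every example.
# Both arguments are arrays of arrays, one inner array per example.
def mean_squared_error(predictions, targets)
  squared = predictions.zip(targets).flat_map do |pred, target|
    pred.zip(target).map { |p, t| (p - t)**2 }
  end
  squared.sum / squared.length
end

# Made-up predictions for the four XOR rows
predictions = [[0.1], [0.9], [0.8], [0.2]]
targets     = [[0.0], [1.0], [1.0], [0.0]]

mse = mean_squared_error(predictions, targets)
puts mse.round(4) # about 0.025
```

Notice that every one of these predictions lands on the right side of 0.5 even though the error is not zero. That's why a moderate MSE target can still give correct classifications.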
A More Practical Example: Simple Spam Detection
XOR is cute, but here's something closer to real work. A basic text classifier that distinguishes spam from ham based on simple features.
require 'ruby-fann'
# Feature extraction: keyword presence flags plus a scaled word count
def extract_features(text)
  text = text.downcase
  [
    text.include?('free') ? 1.0 : 0.0,
    text.include?('winner') ? 1.0 : 0.0,
    text.include?('click') ? 1.0 : 0.0,
    text.include?('buy') ? 1.0 : 0.0,
    text.include?('meeting') ? 1.0 : 0.0,
    text.include?('project') ? 1.0 : 0.0,
    text.include?('attached') ? 1.0 : 0.0,
    text.split.length / 50.0 # Normalized word count
  ]
end
# Training data: [text, [is_spam]]
training_messages = [
  ["FREE WINNER! Click now to claim your prize!", [1.0]],
  ["Meeting tomorrow at 3pm to discuss the project", [0.0]],
  ["Buy now! Limited time offer!", [1.0]],
  ["Please review the attached document", [0.0]],
  ["You're a winner! Free gift inside!", [1.0]],
  ["Project update: milestone completed", [0.0]],
  ["Click here for free stuff!", [1.0]],
  ["Attached is the quarterly report", [0.0]]
]
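# Turn each message into a feature vector and wrap everything in TrainData,
# which is what train_on_data expects
train_data = RubyFann::TrainData.new(
  inputs: training_messages.map { |text, _label| extract_features(text) },
  desired_outputs: training_messages.map { |_text, label| label }
)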
# 8 input features, 5 hidden neurons, 1 output
fann = RubyFann::Standard.new(
  num_inputs: 8,
  hidden_neurons: [5],
  num_outputs: 1
)
fann.train_on_data(train_data, 5000, 500, 0.01)
# Test new messages
test_messages = [
  "Free money! Click to win!",
  "See you at the meeting tomorrow",
  "Buy our product now!"
]
test_messages.each do |msg|
  score = fann.run(extract_features(msg)).first
  label = score > 0.5 ? "SPAM" : "HAM"
  puts "#{label} (#{score.round(3)}): #{msg}"
end
This is a toy example. Real spam detection needs way more features and training data. But it shows the pattern.
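Eyeballing three printed lines only gets you so far. Once you have a few labeled messages set aside, you can measure accuracy instead. The sketch below is plain Ruby with assumptions spelled out: accuracy is a hypothetical helper, the held-out messages are invented, and keyword_scorer is a crude stand-in for the trained network. In real use you would pass something like ->(msg) { fann.run(extract_features(msg)).first }.

```ruby
# Hypothetical stand-in for the trained network: any callable that maps a
# message to a spam score between 0 and 1. Two keyword hits saturate at 1.0.
keyword_scorer = lambda do |text|
  hits = %w[free winner click buy].count { |word| text.downcase.include?(word) }
  [hits / 2.0, 1.0].min
end

# Fraction of labeled messages the scorer classifies correctly
def accuracy(scorer, labeled_messages, threshold = 0.5)
  correct = labeled_messages.count do |text, is_spam|
    (scorer.call(text) > threshold) == is_spam
  end
  correct.to_f / labeled_messages.length
end

# Invented held-out examples: [text, is_spam]
held_out = [
  ["Free money! Click to win!", true],
  ["See you at the meeting tomorrow", false],
  ["Buy our product now!", true],
  ["Lunch on Friday?", false]
]

result = accuracy(keyword_scorer, held_out)
puts "Accuracy: #{result}"
```

The stand-in scorer misses the single-keyword "Buy" message, which is exactly the kind of gap a held-out set exposes and a training-set check hides.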
Saving and Loading Models
Training takes time. Save your models.
# Save after training
fann.save('spam_detector.net')
# Load later
loaded_fann = RubyFann::Standard.new(filename: 'spam_detector.net')
result = loaded_fann.run(some_input)
Tips for Better Results
Normalize your inputs. Neural networks work best with values between 0 and 1, or -1 and 1. Features on wildly different scales, like [1500, 0.02, 891234], let the largest numbers dominate and make training slow or unstable.
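One common way to do this is min-max scaling, where each feature column is squeezed into the 0 to 1 range independently. Here's a minimal sketch in plain Ruby; min_max_scale is a hypothetical helper and the rows are made-up data, with the first row borrowed from the example above.

```ruby
# Scale every column of a feature matrix into the 0..1 range.
# A column where all values are equal carries no information and maps to 0.0.
def min_max_scale(rows)
  ranges = rows.transpose.map { |column| [column.min, column.max] }
  rows.map do |row|
    row.each_with_index.map do |value, i|
      min, max = ranges[i]
      max == min ? 0.0 : (value - min).to_f / (max - min)
    end
  end
end

# Made-up raw feature rows on very different scales
raw = [
  [1500, 0.02, 891_234],
  [300,  0.5,  12_000],
  [900,  0.26, 451_617]
]

scaled = min_max_scale(raw)
scaled.each { |row| puts row.map { |v| v.round(3) }.inspect }
```

In a real pipeline you would keep the min and max computed from the training data and reuse them at prediction time, so new inputs are scaled the same way the network saw during training.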
Add more hidden neurons carefully. Start small. A network with too many neurons overfits quickly and trains slowly.
Multiple hidden layers for complex problems:
fann = RubyFann::Standard.new(
  num_inputs: 10,
  hidden_neurons: [8, 6, 4], # Three hidden layers
  num_outputs: 2
)
Shuffle your training data. Neural networks can get stuck if they see all examples of one class, then all examples of another.
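The one thing to get right when shuffling is keeping each input attached to its label. A minimal sketch in plain Ruby, using invented examples: shuffle the pairs as a unit, then split them into the parallel inputs and desired-outputs arrays used for training.

```ruby
# Invented labeled examples, deliberately ordered: all of one class, then the other
pairs = [
  [[1.0, 1.0], [1.0]],
  [[1.0, 0.0], [1.0]],
  [[0.0, 1.0], [0.0]],
  [[0.0, 0.0], [0.0]]
]

# Shuffle whole pairs so every input keeps its own label.
# The fixed seed is only there to make runs repeatable.
shuffled = pairs.shuffle(random: Random.new(42))

# Split back into parallel arrays in the new order
inputs = shuffled.map(&:first)
desired_outputs = shuffled.map(&:last)

inputs.zip(desired_outputs).each { |i, o| puts "#{i.inspect} -> #{o.inspect}" }
```

Shuffling the two arrays separately would silently scramble the labels, which is a much worse problem than ordered data.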
When Ruby ML Actually Makes Sense
Good use cases:
- Adding simple prediction to existing Rails apps
- Prototyping before rewriting in Python
- Educational purposes and learning neural network basics
- Small-scale classification in data pipelines
Bad use cases:
- Training large models on big datasets
- Computer vision or NLP (use Python)
- Production systems requiring high accuracy
- Anything needing GPU acceleration
Alternative Ruby ML Libraries
ruby-fann isn't your only option:
- Rumale - Scikit-learn inspired library with many algorithms
- TensorStream - TensorFlow-like API (more experimental)
- liblinear-ruby - Linear classifiers, fast for large datasets
- rb-libsvm - Support Vector Machines
For most projects, Rumale is the modern choice. ruby-fann is simpler for neural network basics.
Wrapping Up
Ruby's ML ecosystem is limited compared to Python's. That's just reality. But for the right use cases, it works. You can prototype fast, integrate with existing Ruby code, and avoid the overhead of spinning up Python services.
Start with simple problems. Get comfortable with the basics. And know when to reach for Python instead.
The code here should give you a foundation. The rest is experimentation.