Best Practices¶

This guide provides recommendations for effectively using pollywog in production geological modeling and resource estimation workflows.

Tip

Quick Checklist

✅ Use version control (Git) for all scripts (or at least keep a copy of the notebook in Central)

✅ Add comments and docstrings to your code

✅ Test calculations on small datasets first

✅ Use configuration files for parameters

✅ Follow consistent naming conventions

✅ Validate results against known values

✅ Document assumptions and data sources

Code Organization¶

Project Structure¶

Organize your pollywog scripts in a consistent directory structure:

project/
├── scripts/
│   ├── 01_drillhole_preprocessing.py
│   ├── 02_block_postprocessing.py
│   ├── 03_geometallurgy.py
│   └── 04_economics.py
├── outputs/
│   ├── drillhole_preprocessing.lfcalc
│   ├── block_postprocessing.lfcalc
│   └── ...
├── config/
│   ├── parameters.py
│   └── thresholds.py
├── tests/
│   └── test_calculations.py
└── README.md

You can also create an orchestration script (e.g., run_all.py) to execute all steps in sequence. If you use Jupyter notebooks instead of scripts, Papermill can parameterize and run them.
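A minimal sketch of such an orchestration script, assuming the numbered scripts live in one directory and that filename order is the intended execution order. The helper names `ordered_scripts` and `run_all` are illustrative and not part of pollywog.

```python
"""run_all.py: run the numbered workflow scripts in sequence (illustrative sketch)."""
import subprocess
import sys
from pathlib import Path
from typing import List


def ordered_scripts(script_dir: Path) -> List[Path]:
    """Return scripts named like 01_*.py, sorted by filename."""
    return sorted(p for p in script_dir.glob("[0-9][0-9]_*.py") if p.is_file())


def run_all(script_dir: Path) -> None:
    """Run each script with the current interpreter and stop at the first failure."""
    for script in ordered_scripts(script_dir):
        print(f"Running {script.name}")
        # check=True raises CalledProcessError if the script exits with a non-zero code.
        subprocess.run([sys.executable, str(script)], check=True)
```

Calling `run_all(Path("scripts"))` from the project root would regenerate every .lfcalc file in one pass. Steps that happen inside Leapfrog (importing the files, running estimation) remain manual.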

Script Organization¶

Structure your scripts consistently:

"""
Drillhole Preprocessing
=======================

Purpose: Clean and transform drillhole assay data before estimation
Author: Your Name
Date: 2024-01-15
Updated: 2024-03-20

Inputs:
- Au, Ag, Cu: Raw assay grades

Outputs:
- Au_clean, Ag_clean, Cu_clean: Cleaned grades
- Au_log, Ag_log, Cu_log: Log-transformed grades for kriging
"""

from pollywog.core import CalcSet, Number

# Configuration
OUTLIER_THRESHOLDS = {
    "Au": 100,  # g/t
    "Ag": 500,  # g/t
    "Cu": 5,    # %
}

EPSILON = 1e-6  # For log transforms

# Main calculation set
def create_preprocessing_calcset(metals=None, thresholds=None):
    """
    Create preprocessing calculations for drillhole data.

    Args:
        metals: List of metal names (default: ["Au", "Ag", "Cu"])
        thresholds: Dict of outlier thresholds (default: OUTLIER_THRESHOLDS)

    Returns:
        CalcSet ready to export
    """
    if metals is None:
        metals = ["Au", "Ag", "Cu"]
    if thresholds is None:
        thresholds = OUTLIER_THRESHOLDS

    calcs = []

    # Clean data
    for metal in metals:
        calcs.append(Number(
            name=f"{metal}_clean",
            expression=[f"clamp([{metal}], 0, {thresholds[metal]})"],
            comment_equation=f"Remove negatives and cap at {thresholds[metal]}"
        ))

    # Log transforms
    for metal in metals:
        calcs.append(Number(
            name=f"{metal}_log",
            expression=[f"log([{metal}_clean] + {EPSILON})"],
            comment_equation="Log transform for kriging"
        ))

    return CalcSet(calcs)

if __name__ == "__main__":
    # Create and export
    calcset = create_preprocessing_calcset()
    calcset.to_lfcalc("outputs/drillhole_preprocessing.lfcalc")
    print(f"Exported {len(calcset.items)} calculations")

Naming Conventions¶

Variables¶

Use descriptive, consistent names:

# Good
Au_estimated_kriging
Cu_recovered_payable
domain_geological
nsr_breakeven_cutoff

# Bad
au1
x
temp
calc

Follow these patterns:

Metal grades: Au_est, Cu_final, Ag_recovered
Transformed grades: Au_log, Cu_sqrt, Au_normalized
Domain/category: domain_geo, rocktype, alteration_zone
Economic: nsr, revenue_per_tonne, cutoff_grade
QA/QC: flag_negative, flag_outlier, qa_status
Intermediate: Au_step1, temp_calculation (minimize these)

Calculation Sets¶

Name your .lfcalc files clearly:

# Good
drillhole_preprocessing.lfcalc
block_postprocessing_domain_weighted.lfcalc
geometallurgy_recovery_model.lfcalc
economics_nsr_calculation.lfcalc

# Bad
calcs.lfcalc
output.lfcalc
final.lfcalc

Include context in the filename:

Stage: drillhole, block, mesh
Purpose: preprocessing, postprocessing, qa_qc
Method: domain_weighted, ml_predicted
Version: Optional date or version number
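One way to keep filenames consistent is to build them from these parts in code. The sketch below uses a hypothetical helper, `build_lfcalc_name`, which is not part of pollywog; it simply joins the non-empty parts with underscores.

```python
from typing import Optional


def build_lfcalc_name(stage: str, purpose: str,
                      method: Optional[str] = None,
                      version: Optional[str] = None) -> str:
    """Join the non-empty parts with underscores and append the .lfcalc suffix."""
    parts = [stage, purpose, method, version]
    cleaned = [p.strip().lower().replace(" ", "_") for p in parts if p and p.strip()]
    return "_".join(cleaned) + ".lfcalc"


print(build_lfcalc_name("block", "postprocessing", method="domain_weighted"))
# block_postprocessing_domain_weighted.lfcalc
```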

Data Quality and Validation¶

Input Validation¶

Always validate and clean input data:

from pollywog.core import CalcSet, Number, If

# Remove negative values
Number(name="Au_positive", expression="clamp([Au], 0)")

# Cap extreme outliers
Number(name="Au_capped", expression="clamp([Au], 0, 100)")

# Handle missing/blank values using Leapfrog's is_normal function
Number(name="Au_default", expression=[
    If("not is_normal([Au])", "0.001", "[Au]")
], comment_equation="Use 0.001 for blank/special values")

Range Checking¶

Create flags for out-of-range values:

from pollywog.core import CalcSet, Number, If

qa_checks = CalcSet([
    # Flag impossible values
    Number(name="flag_impossible", expression=[
        If("([Au] < 0) or ([Cu] < 0) or ([density] < 0)", "1", "0")
    ]),

    # Flag extreme values for review
    Number(name="flag_extreme", expression=[
        If("([Au] > 100) or ([Cu] > 10)", "1", "0")
    ]),

    # Flag missing critical data
    Number(name="flag_incomplete", expression=[
        If("([domain] = '') or (not is_normal([density]))", "1", "0")
    ]),
])
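When many variables need an upper-limit check, the condition string can be generated from a dictionary of limits rather than typed by hand. This is a sketch with a hypothetical helper, `upper_limit_condition`; it only builds the string and does not call pollywog.

```python
from typing import Dict


def upper_limit_condition(limits: Dict[str, float]) -> str:
    """Build an 'or' condition that is true when any variable exceeds its limit."""
    if not limits:
        raise ValueError("limits must contain at least one variable")
    clauses = [f"([{name}] > {limit})" for name, limit in limits.items()]
    return " or ".join(clauses)


condition = upper_limit_condition({"Au": 100, "Cu": 10})
print(condition)
# ([Au] > 100) or ([Cu] > 10)
```

The resulting string can be passed as the condition of an `If(...)`, in the same way as the hand-written `flag_extreme` check.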

Avoiding Common Errors¶

Division by Zero¶

Always protect against division by zero:

# Bad
Number(name="ratio", expression=["[numerator] / [denominator]"])

# Good - add small epsilon (only safe when the denominator is never negative)
Number(name="ratio", expression=["[numerator] / ([denominator] + 1e-10)"])

# Good - use conditional
Number(name="ratio", expression=[
    If("[denominator] != 0", "[numerator] / [denominator]", "0")
])

# Good - clamp denominator to a minimum of 0.001 (for positive denominators)
Number(name="ratio", expression=["[numerator] / clamp([denominator], 0.001)"])
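The three guards do not return the same value when the denominator is zero, so the choice affects downstream results. The plain-Python mirror below illustrates the arithmetic only; it is not evaluated by Leapfrog, and it assumes that `clamp(x, lower)` with a single bound acts as a lower limit.

```python
def ratio_with_epsilon(numerator: float, denominator: float, eps: float = 1e-10) -> float:
    """Mirror of [numerator] / ([denominator] + 1e-10)."""
    return numerator / (denominator + eps)


def ratio_with_conditional(numerator: float, denominator: float, fallback: float = 0.0) -> float:
    """Mirror of If([denominator] != 0, [numerator] / [denominator], 0)."""
    return numerator / denominator if denominator != 0 else fallback


def ratio_with_floor(numerator: float, denominator: float, floor: float = 0.001) -> float:
    """Mirror of [numerator] / clamp([denominator], 0.001), assuming a lower-bound clamp."""
    return numerator / max(denominator, floor)


for denominator in (0.0, 2.0):
    print(
        denominator,
        ratio_with_epsilon(5.0, denominator),
        ratio_with_conditional(5.0, denominator),
        ratio_with_floor(5.0, denominator),
    )
```

With a numerator of 5 and a zero denominator, the epsilon version returns a value near 5e10, the conditional returns 0, and the floor version returns about 5000. For a denominator of 2 all three agree to within rounding. Pick the guard whose zero-denominator behaviour makes sense for the quantity being calculated.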

Logarithms of Zero/Negative¶

Add epsilon before taking logarithms:

# Bad
Number(name="Au_log", expression=["log([Au])"])

# Good
Number(name="Au_log", expression=["log([Au] + 1e-6)"])

# Good - clamp first
Number(name="Au_log", expression=["log(clamp([Au], 1e-6))"])
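Whatever offset is added before the logarithm has to be removed again when back-transforming. The sketch below shows the algebraic round trip in plain Python, assuming a natural logarithm; if the `log` function in your environment uses another base, the inverse changes accordingly. It does not cover any bias correction that back-transforming estimated log values may require.

```python
import math

EPSILON = 1e-6  # same offset as used in the forward transform


def log_transform(grade: float, eps: float = EPSILON) -> float:
    """Forward transform: log(grade + eps)."""
    return math.log(grade + eps)


def back_transform(log_value: float, eps: float = EPSILON) -> float:
    """Inverse transform: exp(log_value) - eps."""
    return math.exp(log_value) - eps


for grade in (0.0, 0.05, 2.5):
    transformed = log_transform(grade)
    restored = back_transform(transformed)
    print(f"{grade:>5} -> {transformed:8.4f} -> {restored:.6f}")
```

Note that a zero grade maps to roughly -13.8 with this epsilon, which can sit far from the rest of the transformed distribution.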

Expression Complexity¶

Break complex expressions into steps:

# Bad - hard to read and debug
Number(name="value", expression=[
    "(([Au] * 1800 / 31.1035 * 0.88) + ([Cu] * 3.5 * 22.046 * 0.85) - ([mining_cost] + [processing_cost])) * [tonnes]"
])

# Good - break into logical steps
CalcSet([
    Number(name="Au_value_per_t", expression=["[Au] * 1800 / 31.1035 * 0.88"]),
    Number(name="Cu_value_per_t", expression=["[Cu] * 3.5 * 22.046 * 0.85"]),
    Number(name="revenue_per_t", expression=["[Au_value_per_t] + [Cu_value_per_t]"]),
    Number(name="total_cost", expression=["[mining_cost] + [processing_cost]"]),
    Number(name="nsr", expression=["[revenue_per_t] - [total_cost]"]),
    Number(name="block_value", expression=["[nsr] * [tonnes]"]),
])
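A quick way to gain confidence in a stepwise breakdown is to reproduce it in plain Python with a few hand-picked inputs and inspect each intermediate. The input values below are hypothetical, and the costs are assumed to be per tonne.

```python
import math

# Hypothetical inputs for one block (illustration only).
au, cu, tonnes = 1.2, 0.4, 1000.0          # g/t, %, t
mining_cost, processing_cost = 12.0, 18.0  # $/t (assumed per-tonne costs)

# Same steps as the stepwise calculation set.
au_value_per_t = au * 1800 / 31.1035 * 0.88
cu_value_per_t = cu * 3.5 * 22.046 * 0.85
revenue_per_t = au_value_per_t + cu_value_per_t
total_cost = mining_cost + processing_cost
nsr = revenue_per_t - total_cost
block_value = nsr * tonnes

# Compact form of the same arithmetic, used only as a cross-check.
compact = ((au * 1800 / 31.1035 * 0.88) + (cu * 3.5 * 22.046 * 0.85)
           - (mining_cost + processing_cost)) * tonnes
assert math.isclose(block_value, compact, rel_tol=1e-12)

for name, value in [("Au_value_per_t", au_value_per_t),
                    ("Cu_value_per_t", cu_value_per_t),
                    ("nsr", nsr),
                    ("block_value", block_value)]:
    print(f"{name}: {value:.2f}")
```

Each printed intermediate corresponds to one named calculation, so a wrong constant or unit conversion shows up at the step where it is introduced.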

Parentheses¶

Use parentheses liberally for clarity:

# Ambiguous
Number(name="result", expression=["[a] + [b] * [c] / [d]"])

# Clear
Number(name="result", expression=["[a] + (([b] * [c]) / [d])"])

Documentation and Comments¶

Code Comments¶

Document your intent:

from pollywog.core import CalcSet
from pollywog.helpers import WeightedAverage

# Create domain-weighted grades
# Assumption: prop_oxide + prop_sulfide + prop_transition may be < 1 (waste not estimated)
# The weighted average automatically normalizes by sum of proportions
calcset = CalcSet([
    WeightedAverage(
        variables=["Au_oxide", "Au_sulfide", "Au_transition"],
        weights=["prop_oxide", "prop_sulfide", "prop_transition"],
        name="Au_composite",
        comment="Domain-weighted Au grade, normalized by proportion sum"
    ),
])

Calculation Comments¶

Use comment_equation for business rules:

Number(
    "Au_recovered",
    "[Au_diluted] * 0.88",
    comment_equation="88% recovery per metallurgical test work (Report XYZ-2023)"
)

Number(
    "cutoff_grade",
    "0.3",
    comment_equation="Economic cutoff at $1800/oz Au, $3.50/lb Cu (Jan 2024 prices)"
)

README Documentation¶

Create a README for your project:

# Project Name - Resource Estimation Calculations

## Overview
Automated calculation sets for [Project Name] resource estimation.

## Workflow
1. Drillhole preprocessing: `01_drillhole_preprocessing.py`
2. Block postprocessing: `02_block_postprocessing.py`
3. Geometallurgy: `03_geometallurgy.py`
4. Economics: `04_economics.py`

## Key Assumptions
- Gold price: $1800/oz
- Copper price: $3.50/lb
- Gold recovery: 88%
- Copper recovery: 85%
- Dilution: 5%

## Dependencies
- Python 3.8+
- pollywog 0.1.2+
- scikit-learn (for ML models)

## Usage
```bash
python scripts/01_drillhole_preprocessing.py
# Import outputs/drillhole_preprocessing.lfcalc into Leapfrog
# Run estimation in Leapfrog
python scripts/02_block_postprocessing.py
```

Version Control¶

Git Best Practices¶

Use version control for all pollywog scripts:

# Initialize repository
git init

# Create .gitignore before the first commit so it is tracked
echo "*.lfcalc" >> .gitignore  # Optional: exclude generated files
echo "__pycache__/" >> .gitignore
echo "*.pyc" >> .gitignore

# Stage and commit
git add .gitignore scripts/ config/ tests/ README.md
git commit -m "Initial commit - resource estimation calculations"

Commit Messages¶

Write clear commit messages:

# Good
git commit -m "Update Au outlier threshold from 50 to 100 g/t"
git commit -m "Add copper recovery model from metallurgical tests"
git commit -m "Fix division by zero in NSR calculation"

# Bad
git commit -m "Update"
git commit -m "Fix bug"
git commit -m "Changes"

Configuration Management¶

External Configuration¶

Store parameters separately from code:

# config/parameters.py
METAL_PRICES = {
    "Au": 1800,  # $/oz
    "Ag": 24,    # $/oz
    "Cu": 3.50,  # $/lb
}

RECOVERIES = {
    "Au": 0.88,
    "Ag": 0.75,
    "Cu": 0.85,
}

OUTLIER_CAPS = {
    "Au": 100,  # g/t
    "Ag": 500,  # g/t
    "Cu": 5,    # %
}

DILUTION_FACTOR = 0.95  # grade multiplier (5% dilution)

# scripts/02_block_postprocessing.py
from config.parameters import METAL_PRICES, RECOVERIES, DILUTION_FACTOR
from pollywog.core import CalcSet, Number

calcset = CalcSet([
    Number(name="Au_diluted", expression=[f"[Au_est] * {DILUTION_FACTOR}"]),
    Number(name="Au_recovered", expression=[f"[Au_diluted] * {RECOVERIES['Au']}"]),
])
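Because configuration values are substituted straight into expression strings, a typo such as a recovery of 88 instead of 0.88 would pass silently into the exported file. A small check before building the calculation set can catch this. The sketch below uses a hypothetical `validate_parameters` helper, and the accepted ranges are assumptions to adjust for your project.

```python
from typing import Dict


def validate_parameters(prices: Dict[str, float],
                        recoveries: Dict[str, float],
                        dilution_factor: float) -> None:
    """Raise ValueError if any parameter falls outside its plausible range."""
    for metal, price in prices.items():
        if price <= 0:
            raise ValueError(f"Price for {metal} must be positive, got {price}")
    for metal, recovery in recoveries.items():
        # Recoveries are stored as fractions, not percentages.
        if not 0 < recovery <= 1:
            raise ValueError(f"Recovery for {metal} must be in (0, 1], got {recovery}")
    if not 0 < dilution_factor <= 1:
        raise ValueError(f"Dilution factor must be in (0, 1], got {dilution_factor}")


validate_parameters(
    prices={"Au": 1800, "Ag": 24, "Cu": 3.50},
    recoveries={"Au": 0.88, "Ag": 0.75, "Cu": 0.85},
    dilution_factor=0.95,
)
print("Parameters look plausible")
```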

Environment-Specific Settings¶

Support different environments (dev, prod):

import os
from pathlib import Path

# Determine environment
ENV = os.getenv("LEAPFROG_ENV", "development")

# Set paths based on environment
if ENV == "production":
    OUTPUT_DIR = Path("/shared/leapfrog/calculations")
else:
    OUTPUT_DIR = Path("./outputs")

OUTPUT_DIR.mkdir(parents=True, exist_ok=True)

# Export to appropriate location
calcset.to_lfcalc(OUTPUT_DIR / "preprocessing.lfcalc")

Testing and Validation¶

Unit Testing Calculations¶

Test your calculation logic:

# tests/test_calculations.py
import pytest
from pollywog.core import CalcSet, Number
from pollywog.run import run_calcset

def test_nsr_calculation():
    """Test NSR calculation with known inputs."""
    calcset = CalcSet([
        Number(name="revenue", expression=["[grade] * [price]"]),
        Number(name="cost", expression=["35"]),
        Number(name="nsr", expression=["[revenue] - [cost]"]),
    ])

    # Test with known values
    result = run_calcset(calcset, inputs={"grade": 2.0, "price": 50})

    assert result["revenue"] == 100
    assert result["cost"] == 35
    assert result["nsr"] == 65

def test_domain_weighting():
    """Test weighted average calculation."""
    from pollywog.helpers import WeightedAverage

    calcset = CalcSet([
        WeightedAverage(
            variables=["Au_oxide", "Au_sulfide"],
            weights=["prop_oxide", "prop_sulfide"],
            name="Au_composite"
        )
    ])

    result = run_calcset(calcset, inputs={
        "Au_oxide": 1.5,
        "Au_sulfide": 0.8,
        "prop_oxide": 0.3,
        "prop_sulfide": 0.7,
    })

    expected = (1.5 * 0.3 + 0.8 * 0.7) / (0.3 + 0.7)
    assert abs(result["Au_composite"] - expected) < 0.001
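Expected values for such tests are easier to maintain when they come from a small independent reference function instead of inline arithmetic. The sketch below is a plain-Python reference for a normalized weighted average. Returning `None` when the weights sum to zero is an assumption made here for illustration; check how your calculation set treats that case.

```python
from typing import Optional, Sequence


def weighted_average(values: Sequence[float],
                     weights: Sequence[float]) -> Optional[float]:
    """Return sum(v * w) / sum(w), or None when the weights sum to zero."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        return None
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Proportions that sum to 1, as in the test above.
print(weighted_average([1.5, 0.8], [0.3, 0.7]))
# Proportions that sum to less than 1 (for example, waste not estimated).
print(weighted_average([1.5, 0.8], [0.3, 0.5]))
# Edge case: no material in any domain.
print(weighted_average([1.5, 0.8], [0.0, 0.0]))
```

The first call works out to (0.45 + 0.56) / 1.0 = 1.01 and the second to (0.45 + 0.40) / 0.8 = 1.0625, up to floating-point rounding.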

Validation Against Leapfrog¶

Export small test cases and validate in Leapfrog:

# Create simple test case
test_calcset = CalcSet([
    Number(name="test_sum", expression=["[a] + [b]"]),
    Number(name="test_product", expression=["[a] * [b]"]),
])

test_calcset.to_lfcalc("test_calculations.lfcalc")

# Import into Leapfrog with known values (a=2, b=3)
# Verify test_sum = 5, test_product = 6
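For more than a handful of rows, the comparison can be scripted. The sketch below assumes the evaluated results have been exported to a CSV file with columns named `a`, `b`, `test_sum` and `test_product`; that layout is an assumption for illustration, and `find_mismatches` is a hypothetical helper.

```python
import csv
import io
import math
from typing import List, TextIO


def find_mismatches(csv_file: TextIO, tol: float = 1e-9) -> List[int]:
    """Return 1-based data row numbers whose results differ from a + b or a * b."""
    mismatches = []
    for row_number, row in enumerate(csv.DictReader(csv_file), start=1):
        a, b = float(row["a"]), float(row["b"])
        sum_ok = math.isclose(float(row["test_sum"]), a + b, abs_tol=tol)
        product_ok = math.isclose(float(row["test_product"]), a * b, abs_tol=tol)
        if not (sum_ok and product_ok):
            mismatches.append(row_number)
    return mismatches


# In-memory stand-in for an exported file; the second row is deliberately wrong.
sample = io.StringIO("a,b,test_sum,test_product\n2,3,5,6\n1.5,4,5.5,6.1\n")
print(find_mismatches(sample))
# [2]
```

With a real export, replace the `io.StringIO` object with `open("exported_results.csv", newline="")`, where the filename is whatever you chose when exporting.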

Performance Considerations¶

Minimize Calculations¶

Avoid redundant calculations:

# Bad - calculates Au + Ag twice
CalcSet([
    Number(name="sum_scaled", expression=["([Au] + [Ag]) * 2"]),
    Number(name="sum_offset", expression=["([Au] + [Ag]) + 10"]),
])

# Good - calculate once, reuse
CalcSet([
    Number(name="sum_Au_Ag", expression=["[Au] + [Ag]"]),
    Number(name="sum_scaled", expression=["[sum_Au_Ag] * 2"]),
    Number(name="sum_offset", expression=["[sum_Au_Ag] + 10"]),
])

Topological Sorting¶

Ensure correct calculation order:

from pollywog.core import CalcSet, Number

# Create calculations (order doesn't matter)
calcset = CalcSet([
    Number(name="final", expression=["[intermediate] * 2"]),
    Number(name="intermediate", expression=["[Au] + [Ag]"]),
    Number(name="Au", expression=["clamp([raw_Au], 0)"]),
])

# Sort by dependencies before exporting
sorted_calcset = calcset.topological_sort()
sorted_calcset.to_lfcalc("properly_ordered.lfcalc")
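To illustrate what dependency ordering involves, the sketch below extracts `[name]` references from expression strings and orders the calculations with a depth-first search. It is a conceptual illustration only and makes no claim about how `topological_sort()` is implemented in pollywog; the function name `sort_by_dependencies` is hypothetical.

```python
import re
from typing import Dict, List

# Matches variable references written as [name].
REF_PATTERN = re.compile(r"\[([^\[\]]+)\]")


def sort_by_dependencies(expressions: Dict[str, str]) -> List[str]:
    """Order calculation names so each one follows the calculations it references."""
    ordered: List[str] = []
    state: Dict[str, str] = {}  # name -> "visiting" or "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"Circular dependency involving {name!r}")
        state[name] = "visiting"
        for ref in REF_PATTERN.findall(expressions[name]):
            # Names not defined here are treated as external input columns.
            if ref in expressions:
                visit(ref)
        state[name] = "done"
        ordered.append(name)

    for name in expressions:
        visit(name)
    return ordered


order = sort_by_dependencies({
    "final": "[intermediate] * 2",
    "intermediate": "[Au] + [Ag]",
    "Au": "clamp([raw_Au], 0)",
})
print(order)
# ['Au', 'intermediate', 'final']
```

A circular reference (for example, two calculations that refer to each other) raises a `ValueError` in this sketch instead of producing an arbitrary order.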

Common Pitfalls to Avoid¶

Hardcoding Values: Use configuration files for parameters that may change
Missing Back-transforms: Remember to back-transform after log/sqrt estimation
Ignoring Units: Keep track of units (g/t, %, oz/t, etc.) in comments
No Version Control: Always use Git for calculation scripts
Insufficient Testing: Test edge cases (zero, negative, very large values)
Poor Documentation: Future you will thank present you for good comments
Complex Single Scripts: Break large workflows into logical modules
No Validation: Always validate against manual calculations or Leapfrog

Workflow Checklist¶

Before deploying calculations to production:

☐ Code is organized and well-structured
☐ Variable names are descriptive and consistent
☐ All hardcoded values are moved to configuration
☐ Edge cases are handled (div by zero, log of zero, etc.)
☐ Comments explain business logic and assumptions
☐ Unit tests validate calculation logic
☐ Results validated against manual calculations
☐ Code is in version control with clear commit history
☐ README documents workflow and assumptions
☐ Dependencies are documented (Python version, packages)