⚡

// Skill profile

Open Construction Estimate

Name: Open Construction Estimate
Author: datadrivenconstruction

name: "open-construction-estimate"

by datadrivenconstruction · published 2026-03-22

数据处理自动化任务加密货币

Total installs

Stars

★ 0

Last updated

2026-03

// Install command

$ claw add gh:datadrivenconstruction/datadrivenconstruction-open-construction-estimate

View on GitHub

// Full documentation

---

name: "open-construction-estimate"

description: "Access and utilize open construction pricing databases. Match BIM elements to standardized work items, calculate costs using public unit price databases with 55,000+ work items."

---

# Open Construction Estimate

Overview

This skill leverages open construction pricing databases for automated cost estimation. Match project elements to standardized work items and calculate costs using publicly available unit prices.

**Data Sources:**

OpenConstructionEstimate (55,000+ work items)

RSMeans Online (subscription)

Government pricing databases

Regional cost indexes

> "Открытые базы данных расценок содержат более 55,000 позиций работ, что позволяет автоматизировать сметные расчеты для большинства проектов."

> — DDC LinkedIn

Quick Start

import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Load work items database
work_items = pd.read_csv("open_construction_estimate.csv")
print(f"Loaded {len(work_items)} work items")

# Simple matching function
vectorizer = TfidfVectorizer(ngram_range=(1, 2))
item_vectors = vectorizer.fit_transform(work_items['description'])

def find_matching_items(query, top_n=5):
    query_vec = vectorizer.transform([query])
    similarities = cosine_similarity(query_vec, item_vectors)[0]
    top_indices = similarities.argsort()[-top_n:][::-1]

    return work_items.iloc[top_indices][['code', 'description', 'unit', 'unit_price']]

# Find matches
matches = find_matching_items("reinforced concrete wall 300mm")
print(matches)

Open Database Structure

Database Schema

# Standard work items database structure
WORK_ITEMS_SCHEMA = {
    'code': 'Work item code (e.g., 03.31.13.13)',
    'description': 'Full description of work',
    'short_description': 'Abbreviated description',
    'unit': 'Unit of measure (m³, m², ton, pcs)',
    'unit_price': 'Base unit price',
    'labor_cost': 'Labor component per unit',
    'material_cost': 'Material component per unit',
    'equipment_cost': 'Equipment component per unit',
    'labor_hours': 'Labor hours per unit',
    'crew_size': 'Typical crew size',
    'productivity': 'Units per day',
    'category_l1': 'Primary category (CSI Division)',
    'category_l2': 'Secondary category',
    'category_l3': 'Detailed category',
    'region': 'Geographic region',
    'year': 'Price year',
    'source': 'Data source'
}

# CSI MasterFormat Divisions
CSI_DIVISIONS = {
    '03': 'Concrete',
    '04': 'Masonry',
    '05': 'Metals',
    '06': 'Wood, Plastics, Composites',
    '07': 'Thermal and Moisture Protection',
    '08': 'Openings',
    '09': 'Finishes',
    '10': 'Specialties',
    '21': 'Fire Suppression',
    '22': 'Plumbing',
    '23': 'HVAC',
    '26': 'Electrical',
    '31': 'Earthwork',
    '32': 'Exterior Improvements',
    '33': 'Utilities'
}

Work Item Matching Engine

Semantic Matching System

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sentence_transformers import SentenceTransformer
from typing import List, Dict, Optional, Tuple
import re

class WorkItemMatcher:
    """Match BIM elements to standardized work items"""

    def __init__(self, database_path: str, use_embeddings: bool = True):
        self.db = pd.read_csv(database_path)

        # TF-IDF for fast initial filtering
        self.tfidf = TfidfVectorizer(
            ngram_range=(1, 3),
            max_features=10000,
            stop_words='english'
        )
        self.tfidf_matrix = self.tfidf.fit_transform(self.db['description'])

        # Sentence embeddings for semantic matching
        self.use_embeddings = use_embeddings
        if use_embeddings:
            self.embedder = SentenceTransformer('all-MiniLM-L6-v2')
            self.embeddings = self.embedder.encode(
                self.db['description'].tolist(),
                show_progress_bar=True
            )

    def match(self, query: str, top_n: int = 5,
              category: str = None) -> List[Dict]:
        """Find matching work items for a query"""
        # Filter by category if specified
        if category:
            mask = self.db['category_l1'].str.contains(category, case=False, na=False)
            search_db = self.db[mask]
            search_matrix = self.tfidf_matrix[mask]
        else:
            search_db = self.db
            search_matrix = self.tfidf_matrix

        if self.use_embeddings:
            return self._semantic_match(query, search_db, top_n)
        else:
            return self._tfidf_match(query, search_db, search_matrix, top_n)

    def _tfidf_match(self, query: str, db: pd.DataFrame,
                     matrix, top_n: int) -> List[Dict]:
        """TF-IDF based matching"""
        query_vec = self.tfidf.transform([query])
        similarities = cosine_similarity(query_vec, matrix)[0]

        top_indices = similarities.argsort()[-top_n:][::-1]

        results = []
        for idx in top_indices:
            row = db.iloc[idx]
            results.append({
                'code': row['code'],
                'description': row['description'],
                'unit': row['unit'],
                'unit_price': row['unit_price'],
                'similarity': float(similarities[idx]),
                'category': row.get('category_l1', '')
            })

        return results

    def _semantic_match(self, query: str, db: pd.DataFrame,
                        top_n: int) -> List[Dict]:
        """Semantic embedding based matching"""
        query_embedding = self.embedder.encode([query])

        # Get indices for filtered db
        indices = db.index.tolist()
        filtered_embeddings = self.embeddings[indices]

        similarities = cosine_similarity(query_embedding, filtered_embeddings)[0]
        top_indices = similarities.argsort()[-top_n:][::-1]

        results = []
        for i, idx in enumerate(top_indices):
            row = db.iloc[idx]
            results.append({
                'code': row['code'],
                'description': row['description'],
                'unit': row['unit'],
                'unit_price': row['unit_price'],
                'similarity': float(similarities[idx]),
                'category': row.get('category_l1', '')
            })

        return results

    def match_bim_element(self, element: Dict) -> List[Dict]:
        """Match a BIM element to work items"""
        # Build query from element properties
        query_parts = []

        if element.get('material'):
            query_parts.append(element['material'])
        if element.get('category'):
            query_parts.append(element['category'])
        if element.get('description'):
            query_parts.append(element['description'])

        # Add dimensions if available
        if element.get('thickness'):
            query_parts.append(f"{element['thickness']}mm thick")
        if element.get('height'):
            query_parts.append(f"{element['height']}m high")

        query = ' '.join(query_parts)

        # Determine category from element type
        category = self._get_category_from_element(element)

        return self.match(query, top_n=3, category=category)

    def _get_category_from_element(self, element: Dict) -> Optional[str]:
        """Map BIM element type to CSI category"""
        element_mapping = {
            'IfcWall': 'Concrete|Masonry',
            'IfcSlab': 'Concrete',
            'IfcColumn': 'Concrete|Metals',
            'IfcBeam': 'Concrete|Metals',
            'IfcDoor': 'Openings',
            'IfcWindow': 'Openings',
            'IfcRoof': 'Thermal',
            'IfcStair': 'Concrete',
            'IfcPipeSegment': 'Plumbing',
            'IfcDuctSegment': 'HVAC'
        }

        elem_type = element.get('ifc_type', '')
        return element_mapping.get(elem_type)

Cost Estimation Engine

Automated Estimator

class OpenConstructionEstimator:
    """Generate cost estimates using open databases"""

    def __init__(self, matcher: WorkItemMatcher, region: str = 'default'):
        self.matcher = matcher
        self.region = region
        self.regional_factors = self._load_regional_factors()
        self.estimates = []

    def _load_regional_factors(self) -> Dict[str, float]:
        """Load regional cost adjustment factors"""
        return {
            'default': 1.0,
            'northeast_us': 1.15,
            'southeast_us': 0.92,
            'midwest_us': 0.95,
            'west_us': 1.08,
            'moscow': 1.20,
            'spb': 1.10,
            'regions_ru': 0.85
        }

    def estimate_element(self, element: Dict) -> Dict:
        """Estimate cost for a single element"""
        # Get matching work items
        matches = self.matcher.match_bim_element(element)

        if not matches:
            return {
                'element_id': element.get('id'),
                'status': 'no_match',
                'estimated_cost': 0
            }

        best_match = matches[0]
        quantity = element.get('quantity', 1)
        unit_price = best_match['unit_price']

        # Apply regional factor
        regional_factor = self.regional_factors.get(self.region, 1.0)
        adjusted_price = unit_price * regional_factor

        # Calculate total
        total_cost = adjusted_price * quantity

        estimate = {
            'element_id': element.get('id'),
            'element_type': element.get('ifc_type'),
            'element_description': element.get('description', ''),
            'matched_code': best_match['code'],
            'matched_description': best_match['description'],
            'match_confidence': best_match['similarity'],
            'unit': best_match['unit'],
            'quantity': quantity,
            'unit_price': unit_price,
            'regional_factor': regional_factor,
            'adjusted_unit_price': adjusted_price,
            'total_cost': total_cost
        }

        self.estimates.append(estimate)
        return estimate

    def estimate_project(self, elements: List[Dict]) -> Dict:
        """Estimate entire project"""
        for element in elements:
            self.estimate_element(element)

        df = pd.DataFrame(self.estimates)

        # Summary by category
        if not df.empty:
            summary = df.groupby('element_type').agg({
                'total_cost': 'sum',
                'element_id': 'count',
                'match_confidence': 'mean'
            }).rename(columns={'element_id': 'count'})
        else:
            summary = pd.DataFrame()

        total = df['total_cost'].sum() if not df.empty else 0

        return {
            'total_cost': total,
            'element_count': len(elements),
            'matched_count': len(df[df['match_confidence'] > 0.5]) if not df.empty else 0,
            'summary_by_type': summary.to_dict() if not summary.empty else {},
            'details': self.estimates
        }

    def export_estimate(self, output_path: str) -> str:
        """Export estimate to Excel"""
        df = pd.DataFrame(self.estimates)

        with pd.ExcelWriter(output_path, engine='openpyxl') as writer:
            # Summary
            summary = pd.DataFrame({
                'Metric': ['Total Cost', 'Elements', 'Matched', 'Avg Confidence'],
                'Value': [
                    df['total_cost'].sum() if not df.empty else 0,
                    len(df),
                    len(df[df['match_confidence'] > 0.5]) if not df.empty else 0,
                    df['match_confidence'].mean() if not df.empty else 0
                ]
            })
            summary.to_excel(writer, sheet_name='Summary', index=False)

            # Details
            if not df.empty:
                df.to_excel(writer, sheet_name='Details', index=False)

                # By type
                by_type = df.groupby('element_type')['total_cost'].sum()
                by_type.to_excel(writer, sheet_name='By_Type')

        return output_path

    def get_missing_items(self) -> List[Dict]:
        """Get elements that couldn't be matched"""
        df = pd.DataFrame(self.estimates)
        if df.empty:
            return []

        low_confidence = df[df['match_confidence'] < 0.5]
        return low_confidence.to_dict('records')

Database Management

Creating and Updating Database

class OpenDatabaseManager:
    """Manage open construction pricing database"""

    def __init__(self, db_path: str):
        self.db_path = db_path
        self.db = self._load_or_create()

    def _load_or_create(self) -> pd.DataFrame:
        """Load existing or create new database"""
        try:
            return pd.read_csv(self.db_path)
        except FileNotFoundError:
            return pd.DataFrame(columns=list(WORK_ITEMS_SCHEMA.keys()))

    def add_items(self, items: List[Dict]):
        """Add new work items"""
        new_df = pd.DataFrame(items)
        self.db = pd.concat([self.db, new_df], ignore_index=True)
        self.db.drop_duplicates(subset=['code'], keep='last', inplace=True)

    def update_prices(self, updates: pd.DataFrame, year: int):
        """Update prices with new data"""
        for _, row in updates.iterrows():
            mask = self.db['code'] == row['code']
            if mask.any():
                self.db.loc[mask, 'unit_price'] = row['unit_price']
                self.db.loc[mask, 'year'] = year

    def apply_inflation(self, rate: float):
        """Apply inflation adjustment"""
        self.db['unit_price'] = self.db['unit_price'] * (1 + rate)

    def export_subset(self, category: str, output_path: str):
        """Export subset of database"""
        subset = self.db[
            self.db['category_l1'].str.contains(category, case=False, na=False)
        ]
        subset.to_csv(output_path, index=False)

    def save(self):
        """Save database"""
        self.db.to_csv(self.db_path, index=False)

    def get_statistics(self) -> Dict:
        """Get database statistics"""
        return {
            'total_items': len(self.db),
            'categories': self.db['category_l1'].nunique(),
            'avg_price': self.db['unit_price'].mean(),
            'price_range': (self.db['unit_price'].min(), self.db['unit_price'].max()),
            'latest_year': self.db['year'].max() if 'year' in self.db else None
        }

Quick Reference

| Category | CSI Division | Typical Items |

|----------|--------------|---------------|

| Concrete | 03 | Walls, slabs, columns, beams |

| Masonry | 04 | Brick, block, stone |

| Metals | 05 | Structural steel, misc metals |

| Finishes | 09 | Drywall, paint, flooring |

| MEP | 21-26 | Plumbing, HVAC, electrical |

| Sitework | 31-33 | Excavation, paving, utilities |

Resources

**OpenConstructionEstimate**: Open database initiative

**CSI MasterFormat**: https://www.csiresources.org/standards/masterformat

**DDC Website**: https://datadrivenconstruction.io

Next Steps

See `vector-search` for semantic item matching

See `cost-prediction` for ML-based estimation

See `qto-report` for quantity extraction

// Comments

// Related skills

More tools from the same signal band

Order food/drinks (点餐) on an Android device paired as an OpenClaw node. Uses in-app menu and cart; add goods, view cart, submit order (demo, no real payment).

Sign plugins, rotate agent credentials without losing identity, and publicly attest to plugin behavior with verifiable claims and authenticated transfers.

The philosophical layer for AI agents. Maps behavior to Spinoza's 48 affects, calculates persistence scores, and generates geometric self-reports. Give your...

日历管理数据处理

1 installs★ 0