Below is a proposed list of steps for carrying out the DTO ML project. Please email your feedback to DBrokaw@formarchitecture.com
The DTO ML project is an exercise in learning ML, not in producing amazing insights from AI. We picked the following project to keep things simple enough to complete in a short time frame. The model will analyze the time stamps at which Revit is opened and closed, as recorded in a CSV log file. We will use a supervised learning algorithm, specifically a binary classification model, to study the usage patterns of Autodesk Revit and so understand the demand for Revit licenses within your organization. This information can help determine whether optimized license distribution with Flex Tokens is a good fit for your organization and, if so, for whom, and what you should expect to pay. Revit consumes 10 Tokens per 24 hours while the product is in use, which costs $30/day. This must be weighed against a yearly unlimited license at a cost of $3,430/year.
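The two prices quoted above imply a break-even point, which the short sketch below works out. It assumes that every calendar day on which Revit is used is billed as exactly one $30 charge; the function names are made up for illustration and are not part of any Autodesk tool.

```python
import math

TOKEN_COST_PER_DAY = 30.0     # from the text: 10 Tokens per 24 hours, $30/day
ANNUAL_LICENSE_COST = 3430.0  # from the text: yearly unlimited license


def flex_cost(days_used: int) -> float:
    """Yearly Flex Token cost, assuming one $30 charge per day of use."""
    return days_used * TOKEN_COST_PER_DAY


def break_even_days() -> int:
    """Smallest number of days of use at which tokens cost more than the license."""
    return math.floor(ANNUAL_LICENSE_COST / TOKEN_COST_PER_DAY) + 1


def cheaper_option(days_used: int) -> str:
    """Name the cheaper option for a user who opens Revit on `days_used` days a year."""
    if flex_cost(days_used) < ANNUAL_LICENSE_COST:
        return "flex tokens"
    return "annual license"


if __name__ == "__main__":
    print("Break-even days per year:", break_even_days())
    print("100 days of use:", cheaper_option(100))
    print("150 days of use:", cheaper_option(150))
```

Under that assumption, a user who opens Revit on roughly 114 days a year or fewer is cheaper on tokens, and anyone above that is cheaper on the yearly license.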
The steps to perform this analysis are as follows:
Step 0: Get the data
Build a basic Revit add-in that collects data by using the Revit API to subscribe to the ApplicationInitialized and ApplicationClosing events. When these events are triggered, the add-in writes a time stamp to a CSV file on an accessible on-premises server. Next, we can prepare the data by collecting everyone's log files and combining them into one training set. No one is required to share their company's data, and the add-in will not send your data anywhere outside your organization. If most participants cannot share their Revit open and close data, we will generate a training set of fake data with a Python script. We can also anonymize the log files by replacing actual usernames with generic ones such as user1, user2, and so on.
/*
 * Summary: This code defines a Revit add-in class that implements the IExternalApplication interface and
 * subscribes to the ApplicationInitialized and ApplicationClosing events. The add-in appends a time stamp,
 * username, and event name to a CSV log file when these events occur.
 */
using System;
using System.Globalization;
using System.IO;
using Autodesk.Revit.UI;

namespace YourNamespace
{
    // An IExternalApplication does not need the Transaction or Regeneration attributes;
    // those apply to external commands.
    public class YourAddInClass : IExternalApplication
    {
        // Path to the log file. Point this at a location every workstation can write to;
        // the root of the C drive usually requires elevated rights.
        private const string LogFilePath = @"C:\logfile.csv";

        // The OnStartup method is called when the add-in starts up
        public Result OnStartup(UIControlledApplication application)
        {
            // ApplicationInitialized is raised by the ControlledApplication
            application.ControlledApplication.ApplicationInitialized += OnApplicationInitialized;
            // ApplicationClosing is raised by the UIControlledApplication itself
            application.ApplicationClosing += OnApplicationClosing;
            // Return a Result to indicate that the add-in has started up successfully
            return Result.Succeeded;
        }

        // The OnShutdown method is called when the add-in shuts down
        public Result OnShutdown(UIControlledApplication application)
        {
            // Unsubscribe from both events
            application.ControlledApplication.ApplicationInitialized -= OnApplicationInitialized;
            application.ApplicationClosing -= OnApplicationClosing;
            // Return a Result to indicate that the add-in has shut down successfully
            return Result.Succeeded;
        }

        // Called when the ApplicationInitialized event occurs (Revit has finished opening)
        private void OnApplicationInitialized(object sender, EventArgs e)
        {
            WriteLogEntry("ApplicationInitialized");
        }

        // Called when the ApplicationClosing event occurs (Revit is about to close)
        private void OnApplicationClosing(object sender, EventArgs e)
        {
            WriteLogEntry("ApplicationClosing");
        }

        // Append one "Timestamp,Username,Event" row to the log file
        private static void WriteLogEntry(string eventName)
        {
            // Write the header row once, when the file is first created
            bool isNewFile = !File.Exists(LogFilePath);
            // Use a fixed, culture-independent format so every machine logs the same way
            string timestamp = DateTime.Now.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
            // Environment.UserName returns the Windows user name without the domain prefix
            string username = Environment.UserName;
            using (StreamWriter sw = File.AppendText(LogFilePath))
            {
                if (isNewFile)
                {
                    sw.WriteLine("Timestamp,Username,Event");
                }
                sw.WriteLine(timestamp + "," + username + "," + eventName);
            }
        }
    }
}
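The anonymization step mentioned above could look roughly like the following Python sketch. It assumes each log starts with a Timestamp,Username,Event header row; the function names are hypothetical, and treating usernames as case-insensitive is a choice made here because Windows account names are.

```python
import csv
import io

FIELDNAMES = ["Timestamp", "Username", "Event"]  # assumed header row of the log


def anonymize_rows(rows):
    """Replace each distinct username with user1, user2, ... in order of first appearance."""
    aliases = {}
    anonymized = []
    for row in rows:
        key = row["Username"].lower()  # treat JSmith and jsmith as the same person
        if key not in aliases:
            aliases[key] = f"user{len(aliases) + 1}"
        anonymized.append({**row, "Username": aliases[key]})
    return anonymized


def anonymize_csv(text):
    """Take the CSV log as a string and return the same log with generic usernames."""
    rows = anonymize_rows(csv.DictReader(io.StringIO(text)))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = (
        "Timestamp,Username,Event\n"
        "2023-03-01 06:28:45,jsmith,ApplicationInitialized\n"
        "2023-03-01 09:17:33,jsmith,ApplicationClosing\n"
    )
    print(anonymize_csv(sample))
```

The same alias stays attached to a user throughout one file, so open and close events can still be matched up after the real names are gone.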
Step 0.1: Example of the data
Here's an example of what the log file might look like.
Timestamp,Username,Event
2023-03-01 06:28:45,User1,ApplicationInitialized
2023-03-01 08:12:19,User2,ApplicationInitialized
2023-03-01 08:42:11,User3,ApplicationInitialized
2023-03-01 09:17:33,User1,ApplicationClosing
2023-03-01 09:34:57,User4,ApplicationInitialized
2023-03-01 10:08:21,User5,ApplicationInitialized
2023-03-01 10:51:14,User6,ApplicationInitialized
2023-03-01 12:02:07,User7,ApplicationInitialized
2023-03-01 12:26:12,User3,ApplicationClosing
2023-03-01 12:52:33,User8,ApplicationInitialized
2023-03-01 13:39:45,User2,ApplicationClosing
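If real logs cannot be shared, a fake log in the same layout could be produced with a script along these lines. This is only a sketch: the session start hours, the session lengths, and the 30% chance that a user skips a day are invented numbers, and the function names are hypothetical.

```python
import random
from datetime import datetime, timedelta


def generate_fake_log(n_users=5, n_days=5, start=datetime(2023, 3, 1), seed=0):
    """Return (timestamp, username, event) rows for made-up sessions, sorted by time."""
    rng = random.Random(seed)  # seeded so the same fake data can be regenerated
    rows = []
    for day in range(n_days):
        date = start + timedelta(days=day)
        for user_number in range(1, n_users + 1):
            if rng.random() < 0.3:  # assumed: a user skips Revit on about 30% of days
                continue
            opened = date + timedelta(hours=rng.randint(6, 10), minutes=rng.randint(0, 59))
            closed = opened + timedelta(hours=rng.randint(1, 8), minutes=rng.randint(0, 59))
            rows.append((opened, f"User{user_number}", "ApplicationInitialized"))
            rows.append((closed, f"User{user_number}", "ApplicationClosing"))
    rows.sort(key=lambda row: row[0])
    return rows


def to_csv(rows):
    """Format the rows as CSV text with the Timestamp,Username,Event header."""
    lines = ["Timestamp,Username,Event"]
    lines += [f"{ts:%Y-%m-%d %H:%M:%S},{user},{event}" for ts, user, event in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_csv(generate_fake_log(n_users=3, n_days=2)))
```

Every generated session has one open and one close event, which real logs will not always have (for example, after a crash), so fake data is likely to be cleaner than the real thing.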
Step 1: Load the data
The first step is to load the data into a Pandas DataFrame. The log file contains three columns: the time stamp of the event, the username, and the event name (ApplicationInitialized when Revit is opened, ApplicationClosing when it is closed).
import pandas as pd
# Load the data into a Pandas DataFrame
df = pd.read_csv('log_file.csv')
Step 2: Feature engineering
Next, we need to extract features from the time stamp data that can be used by the machine learning model. We can extract the hour of the day, day of the week, and minute of the hour from the time stamp data.
# Extract features from the time stamp data (the column is named 'Timestamp' in the log file)
timestamps = pd.to_datetime(df['Timestamp'])
df['hour'] = timestamps.dt.hour
df['day_of_week'] = timestamps.dt.dayofweek  # Monday is 0, Sunday is 6
df['minute'] = timestamps.dt.minute
Step 3: Prepare the data
Next, we need to prepare the data for the machine learning model. We will create a binary classification task where we predict whether the application was opened or closed based on the time stamp data.
# Create the target variable: 1 when Revit was opened, 0 when it was closed
df['target'] = (df['Event'] == 'ApplicationInitialized').astype(int)
# Split the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
    df[['hour', 'day_of_week', 'minute']], df['target'], test_size=0.2, random_state=42)
Step 4: Train the model
Next, we will train the machine learning model using the training data.
# Train the model
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)
Step 5: Evaluate the model
Finally, we will evaluate the performance of the logistic regression model on the testing data, using accuracy (the share of events classified correctly) as the metric.
# Evaluate the model
from sklearn.metrics import accuracy_score
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)
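To connect the log back to the licensing question raised at the start, the sketch below counts the days on which each user touched Revit and projects a yearly Flex Token cost. It is a rough illustration and not part of the model above: it assumes one $30 charge per calendar day with any logged event, it assumes the observed period is representative of the whole year, and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

TOKEN_COST_PER_DAY = 30      # from the text: $30/day
ANNUAL_LICENSE_COST = 3430   # from the text: $3,430/year


def active_days_per_user(csv_text):
    """Count the distinct calendar days on which each user has at least one event."""
    days = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        timestamp = datetime.strptime(row["Timestamp"], "%Y-%m-%d %H:%M:%S")
        days[row["Username"]].add(timestamp.date())
    return {user: len(dates) for user, dates in days.items()}


def recommend(active_days, observed_days, days_per_year=365):
    """Scale observed usage up to a year and return (cheaper option, projected token cost)."""
    projected_days = active_days / observed_days * days_per_year
    token_cost = round(projected_days * TOKEN_COST_PER_DAY, 2)
    option = "flex tokens" if token_cost < ANNUAL_LICENSE_COST else "annual license"
    return option, token_cost


if __name__ == "__main__":
    log = (
        "Timestamp,Username,Event\n"
        "2023-03-01 06:28:45,User1,ApplicationInitialized\n"
        "2023-03-01 09:17:33,User1,ApplicationClosing\n"
        "2023-03-02 08:12:19,User1,ApplicationInitialized\n"
    )
    for user, count in active_days_per_user(log).items():
        print(user, count, recommend(count, observed_days=10))
```

A longer observation window makes the projection more trustworthy, since a few days of logs can easily over- or understate how often someone needs Revit.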
Next, we will start a new test to learn how to use ML for basic image analysis... (stay tuned)
Want to be a hands-on participant? Email me at DBrokaw@ForumArchitecture.com