Trello with GetCorrello as an Agile Kanban Tool

After initially using Trello with ‘Scrum for Trello’ to run our company's Agile process, and having to manually create burndowns and release estimates in Excel (after mangling data from the Trello API and a Google spreadsheet), we moved to Trello with GetCorrello (and switched to Kanban after a few months, with the team's consent).

Charts from this tool are below. The GetCorrello team were able to add features on demand. Would use them again!

https://getcorrello.com

Some samples

Burndown

image

Cumulative Flow Diagram

image

Estimated time to complete per label

image

Control chart

image

Cycle times

image

Stats

image

Slack integration

image

 

Our Kanban process

https://www.dropbox.com/s/fj1mr93gqhxv8nc/Fair%20Go%20Kanban%20Process.pptx?dl=0

Coursera Practical Machine Learning – Prediction Assignment




Introduction

This is an analysis for the final assignment of the Coursera course ‘Practical Machine Learning’.

http://web.archive.org/web/20161224072740/http:/groupware.les.inf.puc-rio.br/har

Executive summary

The gradient boosting machine (accuracy 0.9998) achieved slightly better accuracy than the random forest (0.9997).

Background

Using devices such as Jawbone Up, Nike FuelBand, and Fitbit it is now possible to collect a large amount of data about personal activity relatively inexpensively. These types of devices are part of the quantified self movement – a group of enthusiasts who take measurements about themselves regularly to improve their health, to find patterns in their behavior, or because they are tech geeks. One thing that people regularly do is quantify how much of a particular activity they do, but they rarely quantify how well they do it. In this project, your goal will be to use data from accelerometers on the belt, forearm, arm, and dumbbell of 6 participants. They were asked to perform barbell lifts correctly and incorrectly in 5 different ways. More information is available from the website here: http://groupware.les.inf.puc-rio.br/har (see the section on the Weight Lifting Exercise Dataset).

The exercises were performed by six male participants aged between 20-28 years, with little weight lifting experience.

Goal

Predict the manner in which a participant did the exercise: class A (performed properly) or one of the incorrect variations.

Data

Obtaining Data

The training data for this project are available here: https://d396qusza40orc.cloudfront.net/predmachlearn/pml-training.csv

The test data are available here: https://d396qusza40orc.cloudfront.net/predmachlearn/pml-testing.csv

# Download a file only if it is not already present locally
save_file = function(url, name) {
    if (!file.exists(name)) {
        library(downloader)
        download(url, destfile = name)
    }
}

save_file("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-training.csv", "pml-training.csv")
save_file("https://d396qusza40orc.cloudfront.net/predmachlearn/pml-testing.csv", "pml-testing.csv")


# readr column type specification: 'n' = numeric, 'c' = character (160 columns in total)
string40 <- "ncnnccnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn"
string80 <- "nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn"
string120 <- "nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn"
string160 <- "nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnc"
colString <- paste(string40, string80, string120, string160, sep = "")

data.training <- readr::read_csv("pml-training.csv", col_names = TRUE, col_types = colString)
data.testing <- readr::read_csv("pml-testing.csv", col_names = TRUE, col_types = colString)
data.training <- as.data.frame(data.training)

Data Exploration

The goal of this project is to predict the manner in which they did the exercise. This is the “classe” variable in the training set, the last column. Let’s have a look at the training data. Both datasets have 160 columns, with the training set having 19,622 observations and the testing set having 20 observations.

dim(data.training)
## [1] 19622   160
dim(data.testing)
## [1]  20 160

The ‘classe’ variable is the indicator of the training outcome. Classe ‘A’ corresponds to the specified execution of the exercise, while the other four classes correspond to common mistakes. The distribution of this variable across the training set is shown below.
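
A minimal sketch of how such a plot could be produced (the original figure is not reproduced here; this assumes the ggplot2 package is installed):

# Sketch only: bar chart of the 'classe' outcome counts in the training set
library(ggplot2)
ggplot(data.training, aes(x = classe)) +
  geom_bar() +
  labs(x = "classe", y = "count", title = "Distribution of 'classe' in the training set")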

Cleaning Data

To clean up the data we remove columns that contain any NA values.

data.training <- data.training[, colSums(is.na(data.training)) == 0]
data.testing <- data.testing[, colSums(is.na(data.testing)) == 0]

Training

We split the original training set into a training and a test (validation) subset, because the original test set does not have enough observations for evaluating the models.

library(caret)  # provides createDataPartition, train and confusionMatrix

data.include <- createDataPartition(data.training$classe, p = .70, list = FALSE)
data.train <- data.training[data.include,]
data.test <- data.training[-data.include,]

Model build – Random Forest

For this random forest model we apply cross-validation: the data is split into five parts, each of which takes the role of a validation set once. A model is built five times on the remaining data, and the classification error is computed on the validation set. The average of these five error rates is our final error estimate. This can all be implemented using the caret train function. We set a seed because the sampling happens randomly.

cat("Random Forest model started")
## Random Forest model started
library(parallel)    # makeCluster, detectCores
library(doParallel)  # registerDoParallel
set.seed(1)          # seed value assumed here; the original value was not shown
cluster <- makeCluster(detectCores() - 1) # convention to leave 1 core for the OS
registerDoParallel(cluster)
fitControl.rf <- trainControl(method = "cv",
                           number = 5,
                           allowParallel = TRUE)
timer.start <- Sys.time()
model.rf <- train(classe ~ ., data = data.train, method = "rf", trControl = fitControl.rf, verbose = FALSE, na.action = na.omit)
timer.end <- Sys.time()
stopCluster(cluster)
registerDoSEQ()
paste("Random Forest took: ", timer.end - timer.start, attr(timer.end - timer.start, "units"))
## [1] "Random Forest took:  2.51922355095545 mins"

Prediction – Random Forest

prediction.rf <- predict(model.rf, data.test)
confusion_matrix.rf <- confusionMatrix(prediction.rf, data.test$classe)
confusion_matrix.rf
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    A    B    C    D    E
##          A 1673    0    0    0    0
##          B    1 1138    0    0    0
##          C    0    1 1026    0    0
##          D    0    0    0  964    0
##          E    0    0    0    0 1082
## 
## Overall Statistics
##                                      
##                Accuracy : 0.9997     
##                  95% CI : (0.9988, 1)
##     No Information Rate : 0.2845     
##     P-Value [Acc > NIR] : < 2.2e-16  
##                                      
##                   Kappa : 0.9996     
##  Mcnemar's Test P-Value : NA         
## 
## Statistics by Class:
## 
##                      Class: A Class: B Class: C Class: D Class: E
## Sensitivity            0.9994   0.9991   1.0000   1.0000   1.0000
## Specificity            1.0000   0.9998   0.9998   1.0000   1.0000
## Pos Pred Value         1.0000   0.9991   0.9990   1.0000   1.0000
## Neg Pred Value         0.9998   0.9998   1.0000   1.0000   1.0000
## Prevalence             0.2845   0.1935   0.1743   0.1638   0.1839
## Detection Rate         0.2843   0.1934   0.1743   0.1638   0.1839
## Detection Prevalence   0.2843   0.1935   0.1745   0.1638   0.1839
## Balanced Accuracy      0.9997   0.9995   0.9999   1.0000   1.0000

Model build – Gradient Boosting Machine

Now we will do exactly the same, but use boosting instead of a random forest. Getting the accuracy, predictions, etc. works with the same code.

cat("Gradient Boosting Machine model started")
## Gradient Boosting Machine model started
fitControl.gbm <- trainControl(method = "cv", number = 5, allowParallel = TRUE)
timer.start <- Sys.time()
model.gbm <- train(classe ~ ., data = data.train, method = "gbm", trControl = fitControl.gbm, verbose = FALSE, na.action = na.omit)
timer.end <- Sys.time()
paste("Gradient Boosting Machine took: ", timer.end - timer.start, attr(timer.end - timer.start, "units"))
## [1] "Gradient Boosting Machine took:  3.89540884892146 mins"

Prediction – Gradient Boosting Machine

cat("Gradient Boosting Machine predictions")
## Gradient Boosting Machine predictions
prediction.gbm <- predict(model.gbm, data.test)
confusion_matrix.gbm <- confusionMatrix(prediction.gbm, data.test$classe)
confusion_matrix.gbm
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction    A    B    C    D    E
##          A 1673    0    0    0    0
##          B    1 1138    0    0    0
##          C    0    1 1026    0    0
##          D    0    0    0  964    0
##          E    0    0    0    0 1082
## 
## Overall Statistics
##                                      
##                Accuracy : 0.9997     
##                  95% CI : (0.9988, 1)
##     No Information Rate : 0.2845     
##     P-Value [Acc > NIR] : < 2.2e-16  
##                                      
##                   Kappa : 0.9996     
##  Mcnemar's Test P-Value : NA         
## 
## Statistics by Class:
## 
##                      Class: A Class: B Class: C Class: D Class: E
## Sensitivity            0.9994   0.9991   1.0000   1.0000   1.0000
## Specificity            1.0000   0.9998   0.9998   1.0000   1.0000
## Pos Pred Value         1.0000   0.9991   0.9990   1.0000   1.0000
## Neg Pred Value         0.9998   0.9998   1.0000   1.0000   1.0000
## Prevalence             0.2845   0.1935   0.1743   0.1638   0.1839
## Detection Rate         0.2843   0.1934   0.1743   0.1638   0.1839
## Detection Prevalence   0.2843   0.1935   0.1745   0.1638   0.1839
## Balanced Accuracy      0.9997   0.9995   0.9999   1.0000   1.0000
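
The 20 cases in pml-testing.csv are not scored above. A minimal sketch of how the better-scoring model could be applied to them (assuming the cleaned data.testing frame created earlier) is:

# Sketch only: predict the class for the 20 held-out quiz cases
prediction.final <- predict(model.gbm, newdata = data.testing)
prediction.final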


R–Data Exploration

PCA

Principal Components Analysis (PCA) allows us to study and explore a set of quantitative variables measured on a set of objects.

Core Idea

With PCA we seek to reduce the dimensionality (reduce the number of variables) of a data set while retaining as much as possible of the variation present in the data.

Before performing a PCA (or any other multivariate method) we should start with some preliminary explorations:

  • Descriptive statistics
  • Basic graphical displays
  • Distribution of variables
  • Pair-wise correlations among variables
  • Perhaps transforming some variables
  • etc.
image

The minimal output from any PCA should contain 3 things:

  • Eigenvalues: provide information about the amount of variability captured by each principal component
  • Scores or PCs (principal components): provide coordinates to graphically represent objects in a lower dimensional space
  • Loadings: provide information to determine what variables characterize each principal component
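
As a minimal sketch (not part of the original notes), base R's prcomp returns all three of these; here it is applied to the built-in USArrests data:

# Sketch only: PCA with base R on a built-in dataset
pca <- prcomp(USArrests, scale. = TRUE)

pca$sdev^2     # eigenvalues: variance captured by each principal component
head(pca$x)    # scores: coordinates of the objects in the lower-dimensional space
pca$rotation   # loadings: contribution of each original variable to each component
summary(pca)   # proportion of variance explained, useful when deciding how many PCs to retain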

Some questions to keep in mind
  • How many PCs should be retained?
  • How good (or bad) is the data approximation with the retained PCs?
  • What variables characterize each PC?
  • Which variables are influential, and how are they correlated?
  • Which variables are responsible for the patterns among objects?
  • Are there any outlier objects?

Map the .html extension to the .NET Razor view engine

1. Add the buildProvider config for Razor inside the compilation element

image

2. In Application_Start, register the .html extension with Razor

image

3. Create a Start.cshtml page and ingest the index.html page

image

<compilation debug="true" targetFramework="4.6">
  <buildProviders>
    <add extension=".html" type="System.Web.WebPages.Razor.RazorBuildProvider"/>
  </buildProviders>
</compilation>

System.Web.Razor.RazorCodeLanguage.Languages.Add("html", new CSharpRazorCodeLanguage());
WebPageHttpHandler.RegisterExtension("html");

@using Fasti.WebClient
@{
    Layout = null;
    @RenderPage("~/index.html")
}
<!-- Version + @System.Reflection.Assembly.GetAssembly(typeof (Startup)).GetName().Version.ToString(); -->

Value Tests

Having drunk the TDD Kool-Aid over the years, I went deep. BDD with FitNesse and SpecFlow. Maximum code coverage %. Days fixing up UI tests. Ensuring this calls that – all code was covered.

Today, here is how I calculate a test's value.

  1. Will it help me discover a design?
  2. Is the functionality well known? (Is this feature being tested to see if it's used/needed?)
  3. Will it stop hard to find bugs in the future?
  4. Is its value worth maintaining over time?
  5. Will the next person be able to understand and appreciate this test?

If YES then write the test!

Choosing a cloud platform offering

Firstly, some terminology courtesy of Stack Overflow:

IAAS (Infrastructure As A Service) :

  • The base layer

  • Deals with Virtual Machines, Storage (Hard Disks), Servers, Network, Load Balancers etc

PAAS (Platform As A Service) :

  • A layer on top of IAAS

  • Runtimes (like .net/java runtimes), Databases (like MSSql, Oracle), Web Servers (IIS/tomcat etc)

SAAS (Software As A Service) :

  • A layer on top of PAAS

  • Applications like email (Gmail, Yahoo mail etc), Social Networking sites (Facebook etc)

IAAS

PAAS

SAAS

Chocolatey machine setup

https://gist.github.com/chrismckelt/3884f94078a7bd3a773b

 

#Administrator privileges check
If (-NOT ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole(`
    [Security.Principal.WindowsBuiltInRole] "Administrator"))
{
    Write-Warning "You do not have Administrator rights!`nPlease run the build shell as administrator!"
    exit
}

$scriptPath = $MyInvocation.MyCommand.Path
$scriptDirectory = Split-Path $scriptPath
$customDir = Resolve-Path (Join-Path ($scriptDirectory) "..\")
cd $customDir

mkdir "c:\software"
mkdir "c:\downloads"
mkdir "c:\temp"

Write-Host "Welcome to chrismckelts Chocolatey install"
Write-Host "---------------------------------"
get-item "$*.bat"
Write-Host "---------------------------------"

ECHO "Installing PsGet and PS plugins"
(new-object Net.WebClient).DownloadString("http://psget.net/GetPsGet.ps1") | iex
Install-Module pscx
Install-Module psake
ECHO "FINISHED - Installing PsGet and PS plugins - FINISHED"

ECHO "Installing Chocolatey and required apps"
iex ((new-object net.webclient).DownloadString("https://chocolatey.org/install.ps1"))
[Environment]::SetEnvironmentVariable("Path", $env:Path + ";" + $env:systemdrive + '\chocolatey\bin', [System.EnvironmentVariableTarget]::Machine)
choco feature enable -n=allowGlobalConfirmation

choco install dotnet4.5
choco install 7zip
choco install slickrun
choco install fiddler
choco install curl
choco install GoogleChrome
#choco install Firefox
choco install grepwin
choco install ConEmu
choco install notepadplusplus
choco install paint.net
choco install linqpad
choco install PowerGUI
choco install P4Merge
choco install clover
choco install fiddler
choco install sourcetree
choco install googledrive
choco install vlc
choco install nodejs.install
choco install msbuild.communitytasks
choco install procmon
choco install dotPeek
choco install keepass
#choco install logparser1
#choco install VirtualCloneDrive
choco install foxitreader
choco install git-credential-winstore
choco install mRemoteNG

#$webclient = New-Object Net.WebClient
# $url = 'http://download.microsoft.com/download/E/A/3/EA38D9B8-E00F-433F-AAB5-9CDA28BA5E7D/FSharp_Bundle.exe'
# $webclient.DownloadFile($url, "$pwd\FSharp_Bundle.exe")
.\FSharp_Bundle.exe /install /quiet

ECHO "FINISHED - Installing Chocolatey and required apps - FINISHED"

From DotNetBlogEngine to WordPress

So after losing 6 months' worth of posts, I have now converted this blog to use WordPress.

To transfer the posts I wrote a simple LINQPad script that uses WordPressSharp.

After trying Syntax Highlighter I instead chose this code formatter.

 

// need nuget WordPressSharp
internal class FileStore
{
    public IList<Post> Posts { get; private set; }

    public void ReadFiles()
    {
        const string folderPath = @"C:\temp\posts";
        Posts = new List<Post>();
        var files = Directory.GetFiles(folderPath);
        files.ToList().ForEach(ProcessFile);
    }

    private void ProcessFile(string filepath)
    {
        // Load xml
        XDocument xdoc = XDocument.Load(filepath);

        // Run query
        var posts = from lv1 in xdoc.Descendants("post")
                    select new Post()
                    {
                        Title = lv1.Element("title").Value,
                        Content = lv1.Element("content").Value,
                        PublishDateTime = Convert.ToDateTime(lv1.Element("pubDate").Value),
                        CustomFields = GetDesc(new KeyValuePair<string, string>("Description", lv1.Element("description").Value))
                    };

        posts.ToList().ForEach(Posts.Add);
    }

    public CustomField[] GetDesc(params KeyValuePair<string, string>[] fieldsToAdd)
    {
        var list = new List<CustomField>();
        foreach (var keyValuePair in fieldsToAdd)
        {
            list.Add(new CustomField()
            {
                Id = Guid.NewGuid().ToString(),
                Key = keyValuePair.Key,
                Value = keyValuePair.Value
            });
        }
        return list.ToArray();
    }
}

private static void Main(string[] args)
{
    var url = "http://www.mckelt.com/blog/";
    var cfg = new WordPressSiteConfig
    {
        BaseUrl = "http://mckelt.com/blog/xmlrpc.php",
        BlogId = 1,
        Username = "username",
        Password = "password"
    };

    var fs = new FileStore();
    fs.ReadFiles();

    foreach (var post1 in fs.Posts.OrderBy(x => x.PublishDateTime))
    {
        post1.Status = "publish";
        using (var client = new WordPressClient(cfg))
        {
            client.NewPost(post1);
        }
    }
}