Code Library

All code available here is written in Stata. If you do not have a Stata license and would like to run the code below, check with the IT department of your institution; they may have a copy available. Stata code can also be opened and read in any text editor, because do files (the format of Stata code files) are plain text files.

This code determines the RFLP pattern of PRRSv ORF5 sequences in silico. Sequences do not need to be aligned to a reference (but any portion that does not belong to the ORF5 region of the viral genome must be deleted beforehand), and capitalization of nucleotides does not matter. The code is very conservative: if your sequence contains an ambiguity at any site, the ambiguity will overwrite the RFLP result. You may turn this behavior off at the end of the code. Since RFLP is position-based, insertions or deletions will yield unexpected results. This code mimics the cuts found using NEBcutter.
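As a rough illustration of the in silico approach described above, the Python sketch below locates restriction cut positions in an unaligned ORF5 sequence. It is not taken from the Stata do file. The enzyme set (MluI, HincII, SacII) is an assumption based on the enzymes conventionally used for PRRSV ORF5 RFLP typing, and the function name `cut_positions` is hypothetical. Translating cut positions into the numeric RFLP pattern requires a lookup table that is not reproduced here.

```python
import re

# Assumption: the three enzymes conventionally used for PRRSV ORF5 RFLP typing.
# Each entry holds (recognition site as a regex, offset of the cut within the site).
ENZYMES = {
    "MluI": ("ACGCGT", 1),          # A^CGCGT
    "HincII": ("GT[CT][AG]AC", 3),  # GTY^RAC
    "SacII": ("CCGCGG", 4),         # CCGC^GG
}


def cut_positions(seq):
    """Return {enzyme: [cut positions]} for an ORF5 nucleotide sequence.

    A position p means the cut falls right after the p-th nucleotide (1-based).
    Returns None when the sequence contains any character other than A, C, G, T,
    a conservative treatment of ambiguities similar to the one described above.
    """
    seq = seq.upper()  # capitalization does not matter
    if re.search(r"[^ACGT]", seq):
        return None
    cuts = {}
    for name, (site, offset) in ENZYMES.items():
        # The lookahead allows overlapping recognition sites to be reported.
        cuts[name] = [m.start() + offset for m in re.finditer(f"(?=({site}))", seq)]
    return cuts
```

For example, `cut_positions("aaacgcgttt")` reports a single MluI cut after nucleotide 3 and no cuts for the other two enzymes.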

Click the do file icon on the left to download the code.

PRRSv Glycosylation detection

This code detects glycosylation sites in PRRSV Type 2 sequences aligned as amino acids.
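A minimal Python sketch of this kind of scan is shown below; it is an illustration and not the contents of the Stata do file. It assumes that the sites of interest are N-linked glycosylation sequons (N-X-S/T, where X is any residue except proline) and that gaps in the alignment are written as `-`. The function name `n_glyc_sites` is hypothetical.

```python
def n_glyc_sites(aligned_seq):
    """Return the 1-based alignment columns of N in N-X-S/T sequons (X is not P)."""
    # Keep (column, residue) pairs for non-gap characters so that reported
    # positions refer to alignment columns even when gaps interrupt a sequon.
    residues = [
        (col, aa)
        for col, aa in enumerate(aligned_seq.upper(), start=1)
        if aa != "-"
    ]
    sites = []
    # Slide over consecutive residue triplets (gaps already skipped).
    for (col, first), (_, middle), (_, third) in zip(residues, residues[1:], residues[2:]):
        if first == "N" and middle != "P" and third in ("S", "T"):
            sites.append(col)
    return sites
```

Reporting alignment columns instead of ungapped positions keeps the results comparable across all sequences in the same alignment.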

Click the do file icon on the left to download the code.

L-SL of PRRSV stacked bar plot - UPDATED JQ 2023

The code below builds a graph displaying the color assigned to each lineage/sublineage (L/SL) under the JQ 2023 classification. The colors are entered as HEX codes; the figure on the right shows the same colors as RGB codes.

* Requires the community-contributed palettes package (provides colorpalette)
capture program drop colorpalette_lslcolors_JQ
program colorpalette_lslcolors_JQ
    * P: HEX color of each L/SL, in the same order as the names in I
    c_local P #00ffff,#c10534,#00b000,#ccefcc,#80d880,#4dc84d,#008e00,#006a00,#ffd200,#800080,#ff0000,#d7d29e,#e700e7,#595959,#757575,#929292,#9b8989,#b1b1b1,#d0d0d0,#f1f1f1,#ffe3e3,#dff8f8,#f5eeee,#262626,#353a30,#627550,#819c65,#9fc47a,#c3c3c3
    * I: L/SL name attached to each color
    c_local I L1A ,L1B ,L1Cunc ,L1C1 ,L1C2 ,L1C3 ,L1C4 ,L1C5 ,L1D ,L1E ,L1F ,L1H ,L1I ,L2 ,L4 ,L5A ,L5B ,L6 ,L7 ,L8A ,L8B ,L8C ,L8D ,L9A ,L9B ,L9C ,L9D ,L9E ,L11
end

* Display the palette
colorpalette lslcolors_JQ

* Remove the program from memory once the palette has been drawn
capture program drop colorpalette_lslcolors_JQ

L-SL of PRRSV stacked bar plot - OUTDATED

This code builds a stacked bar plot of the number of PRRSV sequences identified per year, broken down by L/SL. The main advantage of the information presented here is the standardization of colors. The specification of each color (both HEX and RGB codes) is listed below.

Click the do file icon on the left to download the code.

L1A - #00FFFF or 0 255 255
L1B - #C10534 or 193 5 52
L1C - #00B000 or 0 176 0
L1D - #FFD200 or 255 210 0
L1E - #800080 or 128 0 128
L1F - #FF0000 or 255 0 0
L1G - #938DD2 or 147 141 210
L1H - #D7D29E or 215 210 158
Type 1 - #3E3E3E or 62 62 62
L2 - #595959 or 89 89 89
L4 - #757575 or 117 117 117
L5 - #929292 or 146 146 146
L6 - #B1B1B1 or 177 177 177
L7 - #D0D0D0 or 208 208 208
L8 - #F1F1F1 or 241 241 241
L9 - #262626 or 38 38 38
L1C 1-4-4 - #00FF80 or 0 255 128
L1C 1-2-4 - ?
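Each entry above gives the same color twice, once as a HEX code and once as an RGB triplet. The short Python sketch below shows how one form converts to the other, which can help when checking a palette by hand. It is an illustration only, and the function names `hex_to_rgb` and `rgb_to_hex` are hypothetical.

```python
def hex_to_rgb(hex_code):
    """Convert 'RRGGBB' or '#RRGGBB' to an (R, G, B) tuple of integers (0-255)."""
    digits = hex_code.strip().lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected six hexadecimal digits, got {hex_code!r}")
    # Each pair of hexadecimal digits encodes one channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(r, g, b):
    """Convert three integers (0-255) to an uppercase '#RRGGBB' string."""
    return "#{:02X}{:02X}{:02X}".format(r, g, b)
```

For instance, `hex_to_rgb("#C10534")` returns `(193, 5, 52)`, matching the L1B entry.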

Example of the graph that this code generates.

Nucleotide percent difference against a reference

This code builds graphs reporting the percent difference between two sequences, computed over nucleotide (NT) windows of different sizes.
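The windowed comparison can be sketched in Python as follows. This is an illustration and not the Stata do file. It assumes that the two sequences are already aligned to the same length and that every mismatching position counts as a difference; the function name `windowed_percent_difference` and the `step` parameter are hypothetical.

```python
def windowed_percent_difference(seq, ref, window, step=1):
    """Percent of mismatching positions in each window of two aligned sequences.

    Returns a list of (window start, percent difference) pairs, where the
    window start is the 1-based position of the first nucleotide in the window.
    """
    seq, ref = seq.upper(), ref.upper()
    if len(seq) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq) - window + 1, step):
        pairs = zip(seq[start:start + window], ref[start:start + window])
        mismatches = sum(a != b for a, b in pairs)
        results.append((start + 1, 100.0 * mismatches / window))
    return results
```

Calling the function once per window size (for example 50, 100, and 200 NT) yields one series per size, which can then be plotted against the window start position.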

Click the do file icon on the left to download the code.

Example of the graph that this code generates.

Regression Coefficients plot

This code constructs a plot that displays the regression coefficients of one or more regression models. 

Click the do file icon on the left to download the code.

Example of the graph that this code generates.

Explore tuning of ML models

This code runs an ML model with varying parameters and plots the impact of those parameters on the sensitivity, specificity, accuracy, and F1 score of the predictions.
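The Python sketch below illustrates how the four performance indicators can be computed for each parameter combination. It is not the downloadable code. It assumes a binary outcome coded as 0/1, and `fit_predict` is a hypothetical user-supplied callable that trains a model with the given tree depth and number of estimators and returns predicted labels.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and F1 score for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def scan_grid(fit_predict, y_true, depths, n_estimators_values):
    """Evaluate every (depth, n_estimators) pair.

    fit_predict is a hypothetical callable: fit_predict(depth, n_estimators)
    must return predicted labels for the same observations as y_true.
    """
    return {
        (depth, n): binary_metrics(y_true, fit_predict(depth, n))
        for depth in depths
        for n in n_estimators_values
    }
```

The resulting dictionary maps each parameter pair to its metrics, so the best pair for a given indicator can be found with, for example, `max(grid, key=lambda k: grid[k]["f1"])`.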

Click the do file icon on the left to download the code.

Example of the graph that this code generates, displaying the combinations of tree depth and n_estimator at which the F1 score is maximized. Similar graphs are generated for the other performance indicators (sensitivity, specificity, accuracy).