10/14 (CSV meeting):
CSV includes mixed types, shared parameters per type
10/7 (Autodetect working session meeting):
Out of scope: Using a CSV to do bulk uploads
Out of scope: Allowing users to order specific files to have analysis done first, or applying ordering in the queue (https://trussworks.slack.com/archives/C018H74LGNL/p1602000690035400)
Out of scope: Adding jobs to, or deleting jobs from, a job already in progress
Possibly out of scope: restarting individual failed jobs in a batch run (depends on level of effort to implement)
10/1:
Reiterated that these are not in scope:
Import from Web apps
Running multiple parameters for the same bulk app
9/28: Genomes are STILL in scope for uploading, as well as the scope below:
Focused on reads, assemblies and genomes (a.k.a. MAGs)
9/8:
FAPROTAX and PICRUSt out of scope for Analysis
Steve to discuss with Dylan more
9/2: Scope meeting with Adam and team
Follow-ups
Steve talked to Dylan about any analysis apps that might already exist: most not in beta, still in dev; may need to follow up with Dylan and Pamela W.
This is our notepad where we can record themes, high-level user outcomes, epics, and stories.
https://miro.com/app/board/o9J_knTHCtM=/?moveToWidget=3074457349556854726&cot=12
List of features based on the HMW solutioning exercise
Bulk Upload UI improvements
Autodetect
Naming conventions
CSV Upload for uploading and importing files only, not for batch analysis
Batch analysis
Shared parameters for single bulk job
Original Scope Presentation
Data upload improvements:
Focused on reads, assemblies and genomes (a.k.a. MAGs)
Target use cases:
Kelly Wrighton WHONDRS data + analysis
ENIGMA data + analysis
This project will only allow for *sets
Homogeneous set types
Support homogeneous groups of objects; *Sets associated with SampleSets can be used as well
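As a rough illustration of the homogeneous-set constraint above, the sketch below checks that every object in a candidate set shares a single data type. The `DataObject` class, the type strings, and `is_homogeneous` are hypothetical names used only for illustration; they are not part of any KBase API.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class DataObject:
    """Hypothetical stand-in for an uploaded object; not a KBase type."""
    name: str
    data_type: str  # e.g. "reads", "assembly", "genome" (illustrative labels)


def is_homogeneous(objects: Iterable[DataObject]) -> bool:
    """Return True when all objects share one data type.

    An empty collection is treated as homogeneous here; that is an
    assumption of this sketch, not a decision recorded in these notes.
    """
    types = {obj.data_type for obj in objects}
    return len(types) <= 1


reads_only = [DataObject("sample1", "reads"), DataObject("sample2", "reads")]
mixed = [DataObject("sample1", "reads"), DataObject("bin1", "genome")]

print(is_homogeneous(reads_only))
print(is_homogeneous(mixed))
```

Under this sketch, the mixed list would be rejected before a set is created, which matches the "no mixed data type processing within a set" boundary.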
List of apps and data types in scope to support:
Affected Apps and Types Scope: https://docs.google.com/spreadsheets/d/1MoVfXBQVzvbUI_LgFuF_-xQ1d0dQUbTY2g4oyohRfIg/edit#gid=0
Reference: KBase's own list of apps overall, with a deprecation tab and current list (might not be complete) https://docs.google.com/spreadsheets/d/1FHCWkqrT5J2FgCj1pF-KVRh366iQGWzyWHGN-s35lAI/edit#gid=1903256230
No new analysis tools or pipelines
No support for JGI MAG data
No processing of new data types not already handled
No mixed data type processing within a set
No technical work by this team on implementing SampleSets (creation/editing, filtering, viewing, or GUIs for subselecting for an app run)
Data palettes
Import from Web apps
Running multiple parameters for the same bulk app
Redesigning the Import UI (vs. improving the UX through an app cell)
Allowing the user to order specific files to have analysis done first, or applying ordering in the queue
A repeatable model for product management, design and software development best practices customized to the specific needs of KBase
Artifacts detailing the iterative software development methodologies used on the KBase project, to serve as a future model for the KBase team’s continued work in iterative software design beyond the scope of this project.
Received assistance and training in the process of delivering:
A prototype to fulfill the highest-priority use case: uploading datasets into the KBase system so that users can easily upload compendia (collections of organized, related data of a single type) and perform meaningful analyses to generate high-value data products from that data.
Infrastructure changes that enable upload and batch operations which serve Knowledge Engine and Concierge needs.
Scoping Document for this project that we worked on a bit with them. Success measures are not complete: https://docs.google.com/document/d/1MduFTlesTQSjkDNfWnJ8Wu0tB-o1G2BFuRTsKbhtPkQ/edit?ts=5ec2c33f#heading=h.340xn1cjmpgi
Initial Success Measures: [updated 8/27/20]
1. Improve the bulk upload user experience
Measured by:
Getting positive feedback from internal stakeholders including KBase staff, JMC, Janaka, Lauren Lui
Getting positive feedback from external stakeholders
Can get a quantitative assessment, such as a 1-5 scale; use it as a relative measure for comparing iterations (not an absolute measure)
Reduce the effort users need for bulk upload and import [TBD #]
2. Users are able to upload 10s-100s of data objects in a single action using bulk upload
Measured by:
Lower error rate per active user for the month when uploading 10s-100s of data objects in a single action, compared with pre-release (may need to wait until there's enough volume on the system post-release to measure)
(accounting for fewer users)
Lower # of Helpdesk tickets (%TBD)
Tie to error rates
Error recovery - people find it hard to recover from errors without asking for help
(TBD %) increase in data being uploaded overall
Getting positive feedback on increase in data being uploaded for some users (revisit user interviewees)
Adoption - how to measure usage
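One way the error-rate measure above could be computed is sketched here: divide the month's bulk-upload errors by that month's active users, so that a month with fewer users is not mistaken for an improvement. The function names and all of the counts are hypothetical placeholders, not real KBase data or an agreed definition.

```python
def error_rate_per_active_user(error_count: int, active_users: int) -> float:
    """Errors per active user for one month (assumed definition)."""
    if active_users <= 0:
        raise ValueError("active_users must be positive to compute a rate")
    return error_count / active_users


def is_improved(pre_release_rate: float, post_release_rate: float) -> bool:
    """True when the post-release rate is strictly lower than pre-release."""
    return post_release_rate < pre_release_rate


# Made-up example counts, for illustration only.
pre_rate = error_rate_per_active_user(error_count=30, active_users=60)
post_rate = error_rate_per_active_user(error_count=10, active_users=40)

print(pre_rate, post_rate, is_improved(pre_rate, post_rate))
```

Normalizing by active users keeps the comparison meaningful even if post-release volume is low, though a very small user count would still make the rate noisy.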
3. Analysis: Users can run analysis to match the WHONDRS data analysis narrative, scaled to 10s to 100s of objects
Measured by:
Users able to run analysis on the WHONDRS narrative workflow on multiple objects (10+)
Users able to run analysis on the ENIGMA narrative workflow on multiple objects (TBD)
Users find that the bulk analysis user experience is improved since pre-release [interviews with stakeholders, template for the future]
Users can do bulk actions without having to use code cells (i.e., without needing that technical skill)
If we improve the bulk data import user experience and run analysis-oriented KBase apps on a set of commonly used data types, users will upload more data to be shared on the system
If we improve the bulk data import user experience and run analysis-oriented KBase apps on a set of commonly used data types, groups will be more likely to bring high-value data into KBase because they will be more self-sufficient
[ENIGMA and WHONDRS data without involving KBase staff to upload]
November/December:
Alpha: Bulk Upload for internal Users
KBase staff
Janaka, JMC, ENIGMA group
January/February 2021:
Alpha: Bulk Analysis for internal users
KBase staff
Janaka, JMC, ENIGMA group
Beta: Bulk Upload for testing with partners
March/April 2021:
Beta: Bulk Upload testing with Kelly Wrighton’s group, JGI
Beta: Bulk Analysis testing with ENIGMA and Kelly Wrighton’s group