Merge 2 files by first column
Slack CH Sept 22nd 2021
Stored on server: /space/chen-syn01/1/data/cinliu/data/merge_by_col1
Hi Nini, could you help me merge these two files by the first column (i.e., sample ID)?
Please merge the big file into the small one, so that the result has 33 rows. Thanks!
Subjects absent from the small file can be omitted.
Excel file (big file): 41588_2012_BFng2335_MOESM53_ESM.xls
2700 rows x 14 columns
Small file: igsr_samples (1).tsv
33 rows x 9 columns
Clear the workspace, set the working directory, and read the input files.
library(readxl) # provides read_excel()
rm(list = ls()) # clear the workspace
setwd('/Users/nini/Desktop/2021lab/CH/merge_by_1st_col/')
big_df <- read_excel("41588_2012_BFng2335_MOESM53_ESM.xls") # big file
small_df <- read.delim2("~/Desktop/2021lab/CH/merge_by_1st_col/igsr_samples (1).tsv") # small file
Make ID the common identifier in both data frames. Then call the merge function with ID as the key column (the by argument).
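The merge step described above can be sketched on toy data. This is a minimal sketch, not the original script: the sample IDs, the Sample, Sex and coverage columns, and the choice of all.x = TRUE are assumptions made for illustration. Only Sample.name and ID come from the notes.

```r
# Toy stand-ins for the two inputs (hypothetical values and column names,
# apart from Sample.name and ID).
big_df <- data.frame(
  Sample = c("HG00096", "HG00097", "HG00099", "HG00100"),
  coverage = c(4.1, 5.3, 6.0, 3.8),
  stringsAsFactors = FALSE
)
small_df <- data.frame(
  Sample.name = c("HG00097", "HG00100", "NA99999"),
  Sex = c("female", "female", "male"),
  stringsAsFactors = FALSE
)

# Give both data frames a key column with the same name.
names(big_df)[1] <- "ID"             # rename the first column of the big file
small_df$ID <- small_df$Sample.name  # keep Sample.name, add ID as a copy

# all.x = TRUE keeps every row of small_df (a left join). Samples found only
# in big_df are dropped; small_df samples without a match get NA in the
# columns that come from big_df.
merged_df <- merge(small_df, big_df, by = "ID", all.x = TRUE)

# As long as the IDs in big_df are unique, the row count equals nrow(small_df).
nrow(merged_df)
```

With the real files, the same check (nrow(merged_df) against the 33 rows of the small file) is a quick way to confirm that no sample was dropped or duplicated.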
My personal favorite video tutorial on how to use the merge function:
Merge Data Frames by Column Names in R (Example) | Combine with merge Function: https://www.youtube.com/watch?v=rlvWJdjYo1g
Save as tab separated file
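Saving the result as a tab-separated file could look roughly like the sketch below. The toy data frame, the output file name, and the decision to drop the helper ID column before writing are all assumptions for illustration. The notes only say that Sample.name is the name to keep in the saved tsv.

```r
# Toy merged result (hypothetical values; in practice this comes from merge()).
merged_df <- data.frame(
  ID = c("HG00097", "HG00100"),
  Sample.name = c("HG00097", "HG00100"),
  Sex = c("female", "female"),
  coverage = c(5.3, 3.8),
  stringsAsFactors = FALSE
)

# Drop the helper key so that Sample.name is the identifier in the output.
merged_df$ID <- NULL

# Hypothetical output path; a temporary directory keeps the sketch side-effect free.
out_file <- file.path(tempdir(), "merged_by_col1.tsv")

# sep = "\t" writes tab-separated values; quote = FALSE and row.names = FALSE
# avoid quoted strings and an extra column of row numbers.
write.table(merged_df, file = out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to confirm its shape.
check_df <- read.delim(out_file, stringsAsFactors = FALSE)
dim(check_df)
```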
#merge data
small_df$ID <- small_df$Sample.name #make ID the common name to refer to. Not removing Sample.name because in the end that is the name we want to keep when saving tsv
Full code:
merge_by_1st_col.R