extract certain data from multiple excel files with R


By : swensonia
Date : November 22 2020, 03:01 PM
I import my data from multiple Excel files into R (there could be 100+ files each day), and I want to extract certain data from them. This should do the trick:
code :
library(tidyverse)

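# Keep only the rows with Code 7229 in each data frame, row-bind the results,
# and write them to a single CSV file.
df_list %>% 
  map_dfr(filter, Code == 7229) %>% 
  write_csv("/INSERT/PATH/HERE/text.csv")

Here is a reproducible example with two data frames:
code :
df_1 <- tribble(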
  ~ST,  ~Code, ~Emp, ~Employee, ~Pay.Code,               ~Hours, ~Gross,
  "AL", 7229,  65,   "S",       "HOURLY",                0.00,   0.00,
  "AL", 7229,  65,   "S",       "SALARY",                0.00,   3060.00,
  "AL", 7229,  65,   "S",       "PER DIEM",              0.00,   765.00,
  "AL", 7229,  65,   "S",       "EXPENSE REIMBURSEMENT", 0.00,   11.00,
  "CA", 42,    2,    "R",       "HOURLY",                60.00,  720.00,
  "CA", 42,    2,    "R",       "OVERTIME",              3.25,   58.50,
  "CA", 42,    3,    "A",       "HOURLY",                80.00,  800.00,
  "CA", 42,    3,    "A",       "OVERTIME",              6.25,   93.75,
  "CA", 42,    4,    "N",       "HOURLY",                79.25,  990.63,
  "CA", 42,    4,    "N",       "OVERTIME",              7.00,   131.25,
  "CA", 42,    9,    "P",       "HOURLY",                32.00,  352.00,
  "CA", 42,    9,    "P",       "OVERTIME",              1.75,   28.88,
  "CA", 42,    10,   "E",       "HOURLY",                72.00,  864.00,
  "CA", 42,    10,   "E",       "OVERTIME",              5.00,   90.00
)

df_2 <- tribble(
  ~ST, ~Code, ~Employee, ~Pay.Code,    ~Gross,
  "AL", 7229,       NA,       NA,  23954.0,
  "AL", 8380,       NA,       NA,  11092.1,
  "GA", 7380,       NA,       NA,  98142.0,
  "GA", 8380,       NA,       NA,  11984.0,
  "NC", 7380,       NA,       NA, 218129.0,
  "NC", 8380,       NA,       NA,  27891.0,
  "TN", 7380,       NA,       NA,  28441.0,
  "TN", 8380,       NA,       NA,   8348.0
)

df_list <- list(df_1, df_2)

df_list %>% 
  map_dfr(filter, Code == 7229) %>% 
  write_csv("/INSERT/PATH/HERE/text.csv")
# A tibble: 5 x 7
     ST  Code   Emp Employee              Pay.Code Hours Gross
  <chr> <dbl> <dbl>    <chr>                 <chr> <dbl> <dbl>
1    AL  7229    65        S                HOURLY     0     0
2    AL  7229    65        S                SALARY     0  3060
3    AL  7229    65        S              PER DIEM     0   765
4    AL  7229    65        S EXPENSE REIMBURSEMENT     0    11
5    AL  7229    NA     <NA>                  <NA>    NA 23954
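
For readers working in Python rather than R, the same filter-then-row-bind idea can be sketched with plain dictionaries. This is only an illustration under assumptions: the helper name `filter_and_bind` is hypothetical, the rows are a small subset of the example data above, and columns that a table lacks are filled with `None`, mirroring the `NA` values in the R output.

```python
def filter_and_bind(tables, column, value):
    """Keep rows where row[column] == value in every table, then stack them."""
    kept = [row for table in tables for row in table if row.get(column) == value]
    # Union of column names, in first-seen order, so tables with
    # different columns can still be combined.
    columns = []
    for row in kept:
        for name in row:
            if name not in columns:
                columns.append(name)
    # Fill columns that a row lacks with None (like NA in R).
    return [{name: row.get(name) for name in columns} for row in kept]


# Hypothetical subset of df_1 and df_2 from the example above.
table_1 = [
    {"ST": "AL", "Code": 7229, "Emp": 65, "Pay.Code": "SALARY", "Gross": 3060.0},
    {"ST": "CA", "Code": 42, "Emp": 2, "Pay.Code": "HOURLY", "Gross": 720.0},
]
table_2 = [
    {"ST": "AL", "Code": 7229, "Gross": 23954.0},
    {"ST": "GA", "Code": 7380, "Gross": 98142.0},
]

combined = filter_and_bind([table_1, table_2], "Code", 7229)
for row in combined:
    print(row)
```

Every combined row carries the same set of keys, so the result can be written out with `csv.DictWriter` without further checks.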


How can I extract data from multiple files in a folder of excel


By : Muhammed Afsal
Date : March 29 2020, 07:55 AM
You can use the FileSystemObject to get the names of all the files you need, and then do what you want with each one in a loop.
code :
Dim fso As Object
Dim folder As Object
Dim file As Object
Dim xlWb As Workbook

Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder("your\folder")

' Loop over every file in the folder
For Each file In folder.Files
  ' file.Path is already the full path, including the file name
  Set xlWb = Workbooks.Open(file.Path)
  'your code here
  xlWb.Close
Next
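
The folder loop can also be sketched in Python with `pathlib`. This is a minimal sketch, not part of the original answer: the function name `list_excel_files` and the extension filter are assumptions (the VBA above opens every file in the folder), and the demo uses empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path


def list_excel_files(folder):
    """Return the Excel workbooks in `folder`, sorted by name."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() in {".xlsx", ".xlsm", ".xls"}
        and not p.name.startswith("~$")  # skip Office lock files
    )


# Demo on a throwaway folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["a.xlsx", "b.xls", "notes.txt", "~$a.xlsx"]:
        (Path(tmp) / name).touch()
    found = [p.name for p in list_excel_files(tmp)]
    print(found)
```

Filtering by extension before opening anything avoids errors when the folder holds files that are not workbooks.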
Search and extract data from multiple excel files with condition


By : Mohmmed Asif
Date : March 29 2020, 07:55 AM
I think the issue is caused by the following. First of all, your Test.xlsx should be saved as an .xlsm file if your code is in there. All the other files (the ones that hold only data and no code) should be .xlsx instead.
Now try this code, with the above changes, in a module in Test.xlsm:
code :
Sub openFilesExtractData()

    Dim folderPath As String, path As String, yourText As String, Filename As String
    Dim currWbSh As Worksheet
    Dim i As Long, j As Long, k As Long

    folderPath = ThisWorkbook.path

    path = folderPath & "\*.xlsx"

    Filename = Dir(path)

    j = ThisWorkbook.Worksheets(1).UsedRange.Rows.count + 1
    k = 1

    Do While Filename <> ""

        If Filename <> ThisWorkbook.Name And Filename <> "" Then

            Workbooks.Open folderPath & "\" & Filename

            Set currWbSh = Workbooks(Filename).Worksheets(1)

            yourText = InputBox("What are you searching for?")

            For i = 1 To currWbSh.UsedRange.Rows.count

                Select Case yourText

                    Case currWbSh.Cells(i, 5):

                        ThisWorkbook.Worksheets(1).Cells(j, k) = yourText
                        k = k + 1

                    Case currWbSh.Cells(i, 6):

                        ThisWorkbook.Worksheets(1).Cells(j, k) = yourText
                        k = k + 1

                End Select

                If i = currWbSh.UsedRange.Rows.count And k <> 1 Then

                    ThisWorkbook.Worksheets(1).Cells(j, k) = Filename
                    j = j + 1

                End If

            Next i

            Workbooks(Filename).Close False

        End If

        Filename = Dir()
        k = 1
        j = ThisWorkbook.Worksheets(1).UsedRange.Rows.count + 1

    Loop

End Sub
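
The matching logic of this macro (compare the search text with columns E and F of each row, then record the file name when anything matched) can be sketched in Python on in-memory rows. This is only an illustration under assumptions: `search_tables` and the sample data are hypothetical, and the search text is passed in as a parameter, so it is supplied once instead of being requested for every file as the `InputBox` call inside the loop does.

```python
def search_tables(tables, text, columns=(4, 5)):
    """Return (filename, hit_count) for every table containing `text`.

    `tables` maps a file name to its rows (lists of cell values).
    `columns` are zero-based indexes, so (4, 5) means columns E and F.
    """
    results = []
    for filename, rows in tables.items():
        hits = 0
        for row in rows:
            for col in columns:
                if col < len(row) and row[col] == text:
                    hits += 1
                    break  # count each row at most once
        if hits:
            results.append((filename, hits))
    return results


# Hypothetical worksheets: only columns E and F hold data here.
tables = {
    "jan.xlsx": [["", "", "", "", "apple", "pear"],
                 ["", "", "", "", "kiwi", "apple"]],
    "feb.xlsx": [["", "", "", "", "plum", "fig"]],
}

matches = search_tables(tables, "apple")
print(matches)
```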
How do I extract data from multiple text files to Excel using Python? (One file's data per sheet)


By : Rohit Thakur
Date : March 29 2020, 07:55 AM
If you're not opposed to having the output Excel file as an .xlsx rather than an .xls, I'd recommend making use of some of the features of pandas, in particular pandas.read_csv() and DataFrame.to_excel().
I've provided a fully reproducible example of how you might go about doing this. Please note that the three lines after the imports create two .txt files for the test.
code :
import pandas as pd
import numpy as np
import glob

# Creating a dataframe and saving as test_1.txt/test_2.txt in current directory
# feel free to remove the next 3 lines if you want to test in your directory
df = pd.DataFrame(np.random.randn(10, 3), columns=list('ABC'))
df.to_csv('test_1.txt', index=False)
df.to_csv('test_2.txt', index=False)

txt_list = [] # empty list
sheet_list = [] # empty list

# a for loop through filenames matching a specified pattern (.txt) in the current directory
for infile in glob.glob("*.txt"): 
    outfile = infile.replace('.txt', '') #removing '.txt' for excel sheet names
    sheet_list.append(outfile) #appending for excel sheet name to sheet_list
    txt_list.append(infile) #appending the '...txt' file name to txt_list

writer = pd.ExcelWriter('summary.xlsx', engine='xlsxwriter')

# a for loop through all elements in txt_list
for i in range(0, len(txt_list)):
    df = pd.read_csv('%s' % (txt_list[i])) #reading element from txt_list at index = i 
    df.to_excel(writer, sheet_name='%s' % (sheet_list[i]), index=False) #reading element from sheet_list at index = i 

writer.close()  # writes the workbook to disk; ExcelWriter.save() was removed in pandas 2.0
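
One caveat when file names become sheet names: Excel limits worksheet names to 31 characters and rejects the characters `[ ] : * ? / \`. The helper below is a hedged sketch of one way to derive a safe name; `sheet_name_from_path` is a hypothetical name, not part of pandas, and it does not handle two files that collapse to the same sheet name.

```python
import re
from pathlib import Path

MAX_SHEET_NAME = 31  # Excel's limit on worksheet name length


def sheet_name_from_path(path):
    """Derive a worksheet name from a file path."""
    stem = Path(path).stem  # drops the directory and the final extension
    cleaned = re.sub(r"[\[\]:*?/\\]", "_", stem)  # characters Excel rejects
    return cleaned[:MAX_SHEET_NAME] or "Sheet"


print(sheet_name_from_path("test_1.txt"))
print(len(sheet_name_from_path("a" * 40 + ".txt")))
```

In the loop above, such a helper could stand in for `infile.replace('.txt', '')`, which removes every occurrence of `.txt` in the name and not only the extension.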
Extract tables from multiple files to Excel using VBA


By : Mathijs
Date : March 29 2020, 07:55 AM
As per PEH's and my comments earlier, here is an approach.
Copy the procedures below into a module:
code :
Sub LookForWordDocs()
    Dim sFoldPath As String: sFoldPath = "c:\temp\"     ' Change the path. Ensure that you have "\" at the end of your path
    Dim oFSO As New FileSystemObject                    ' Requires "Microsoft Scripting Runtime" reference
    Dim oFile As file

    ' Loop to go through all files in specified folder
    For Each oFile In oFSO.GetFolder(sFoldPath).Files

        ' Check if file is a word document. (Also added a check to ensure that we don't pick up a temp Word file)
        If (InStr(1, LCase(oFSO.GetExtensionName(oFile.Path)), "doc", vbTextCompare) > 0) And _
                (InStr(1, oFile.Name, "~$") = 0) Then

            ' Call the UDF to copy from word document
            CopyTableFromWordDoc oFile

        End If

    Next

End Sub
Sub CopyTableFromWordDoc(ByVal oFile As file)
    Dim oWdApp As New Word.Application                      ' Requires "Microsoft Word .. Object Library" reference
    Dim oWdDoc As Word.Document
    Dim oWdTable As Word.Table
    Dim oWS As Worksheet
    Dim lLastRow$, lLastColumn$

    ' Code to copy table from word document to this workbook in a new worksheet
    With ThisWorkbook

        ' Add the worksheet and change the name to what file name is
        Set oWS = .Worksheets.Add
        oWS.Name = oFile.Name

        ' Open Word document
        Set oWdDoc = oWdApp.Documents.Open(oFile.Path)

        ' Set table to the first table in the document (change the index to target a different table)
        Set oWdTable = oWdDoc.Tables(1)

        ' Copy the table to new worksheet
        oWdTable.Range.Copy
        oWS.Range("A1").PasteSpecial xlPasteValues, xlPasteSpecialOperationNone

        ' Close the Word document
        oWdDoc.Close False

        ' Close word app
        oWdApp.Quit

    End With

End Sub
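
The file check in `LookForWordDocs` (extension contains "doc", name does not contain the "~$" temp-file marker) can be mirrored in Python as follows. This is a sketch for illustration only; `is_word_document` is a hypothetical helper, and it copies the macro's loose substring test instead of checking an exact list of extensions.

```python
from pathlib import PurePath


def is_word_document(filename):
    """Mirror the VBA check: 'doc' in the extension and no '~$' in the name."""
    name = PurePath(filename).name
    extension = PurePath(filename).suffix.lower().lstrip(".")
    return "doc" in extension and "~$" not in name


for candidate in ["report.docx", "~$report.docx", "table.xlsx", "OLD.DOC"]:
    print(candidate, is_word_document(candidate))
```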
Extract data from specific cells in multiple excel files - R


By : alvarisi
Date : March 29 2020, 07:55 AM
I do not use read_excel myself and you did not provide a minimal reproducible example (MRE), so I could not test this, but you could try the following.
code :
df.list <- lapply(file.list, read_excel, sheet = 1, range = "E6:E7", col_names = FALSE, col_types = NULL)