R - Web scraping and downloading multiple zip files and saving the files without overwriting



By : shadySource
Date : November 20 2020, 03:01 PM
I am trying to download multiple zip files from a web link. The downloaded files were getting overwritten because the file names are the same across multiple years. Here is what I have so far:
code :
# Load the library
library(rvest)

# Link to get the data from
url <- "https://download.open.fda.gov/"
page <- read_html(url)

# Keep only the <key> entries for drug-event files and rebuild the full URLs
zips <- grep("\\/drug-event", html_nodes(page, "key"), value = TRUE)
zips_i <- gsub(".*\\/drug\\/", "drug/", zips)
zips_ii <- gsub("</key>", "", zips_i)
zips_iii <- paste0(url, zips_ii)

# Destination vector: one numbered path per URL, so the names never collide
id <- seq_along(zips_iii)
destination <- paste0("~/Projects/Projects/fad_ade/", id)

# Download each file in binary mode
mapply(function(x, y) download.file(x, y, mode = "wb"), x = zips_iii, y = destination)
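
Numbered destinations avoid collisions but lose the information about which period a file belongs to. As a rough sketch of an alternative, the destination names can be built from the URL path itself. This assumes the URLs carry the period in a folder name, as in the made-up examples below; `unique_destinations` is a hypothetical helper, not part of rvest or base R.

```r
# Hypothetical helper: turn each URL into a unique local file name by
# flattening the path that follows the base URL.
unique_destinations <- function(urls, base_url, dest_dir) {
  # Drop the base URL, leaving e.g. "drug/event/2004q1/drug-event-0001-of-0001.json.zip"
  rel <- sub(base_url, "", urls, fixed = TRUE)
  # Replace path separators so the period folder becomes part of the file name
  flat <- gsub("/", "_", rel, fixed = TRUE)
  # make.unique() appends .1, .2, ... if two names still collide
  file.path(dest_dir, make.unique(flat))
}

# Illustrative URLs only (assumed layout, not fetched from the server)
urls <- c(
  "https://download.open.fda.gov/drug/event/2004q1/drug-event-0001-of-0001.json.zip",
  "https://download.open.fda.gov/drug/event/2004q2/drug-event-0001-of-0001.json.zip"
)
destination <- unique_destinations(urls, "https://download.open.fda.gov/", tempdir())
basename(destination)
```

The resulting vector can be passed as the `y` argument of the `mapply()` call in place of the numbered paths.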


Preventing overwriting of files when using save() or save.image()



By : bimplebean
Date : March 29 2020, 07:55 AM
Use file.exists() to test whether the file is already there and, if it is, append a string to the name.
Edit:
code :
SafeSave <- function( ..., file=stop("'file' must be specified"), overwrite=FALSE, save.fun=save) {
  # Refuse to write when the target exists, unless overwrite=TRUE was requested
  if ( file.exists(file) && !overwrite ) stop("'file' already exists")
  save.fun(..., file=file)
}
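
SafeSave stops when the file exists. The other option mentioned, appending a string to the name, might look like the sketch below. `next_free_name` is a hypothetical helper, and the `_1`, `_2`, ... suffix scheme is an assumption rather than anything prescribed by save().

```r
# Hypothetical helper: return `file` if it is unused, otherwise insert
# _1, _2, ... before the extension until an unused name is found.
next_free_name <- function(file) {
  if (!file.exists(file)) return(file)
  ext <- tools::file_ext(file)
  stem <- tools::file_path_sans_ext(file)
  i <- 1
  repeat {
    candidate <- if (nzchar(ext)) {
      sprintf("%s_%d.%s", stem, i, ext)
    } else {
      sprintf("%s_%d", stem, i)
    }
    if (!file.exists(candidate)) return(candidate)
    i <- i + 1
  }
}

# Usage: the second save goes to a new file instead of overwriting the first
x <- 1:3
target <- tempfile(fileext = ".RData")  # a fresh path that does not exist yet
first <- next_free_name(target)
save(x, file = first)
second <- next_free_name(target)
save(x, file = second)
```

The check and the write are separate steps, so this is only suitable when a single R session writes to the directory.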
Downloading multiple files writes same data to all files



By : Asif
Date : March 29 2020, 07:55 AM
Yes, this is what I mentioned in my comment. Each time connectionDidFinishLoading: is called, you've got the result of just one connection. If you loop through all the file names there, you will write that same chunk of data out to all those names, repeatedly. Each time through the for loop in parsingComplete: you create a new connection, get a new data object, and then write that same object out multiple times. After the end of the parsing... loop, you're left with a list of files all with the data from the last connection.
I'm pretty tired and I'm not sure: am I being clear?
code :
/* In parsingCompleted: */
for (int x = 0; x < [catArray count]; x++) 
{
    /*  download each file to the corresponding category sub-directory  */
    // fileOut is an instance variable
    fileOut = [NSString stringWithFormat:@"%@/%@_0%i.jpg",cat,catName,x];
    imageRequest = [NSURLRequest etc...
- (void)connectionDidFinishLoading:(NSURLConnection *)connection {
    // No loop; just use that file name that you set up earlier; 
    // it correctly corresponds to the current NSURLConnection
    [receivedData writeToFile:fileOut atomically:YES];
Downloading multiple files from an FTP server files using Libcurl



By : Chris Maissan
Date : March 29 2020, 07:55 AM
I used code like the following to download all the files from the FTP server. The output file needs to be opened as a binary file when downloading:
code :
fp = fopen(ofname,"wb");
Downloading multiple files at the same time and continue after all files finished downloading



By : iBilgee
Date : March 29 2020, 07:55 AM
You can store all the tasks in a collection and then call Task.WaitAll(yourArray); your code will block until all tasks complete. Something like this:
code :
var tasks = new List<Task>();
foreach (var File in ServerFiles)
{
    string sFileName = File.Uri.LocalPath.ToString();
    // some internal logic and initialization 
    Task downloadTask = oBlob.DownloadToStreamAsync(fileStream);
    tasks.Add(downloadTask);
    sFiles += sFileName.Replace("/" + Container + "/", "") + ",";
}
// Task.WaitAll takes an array, so convert the list first
Task.WaitAll(tasks.ToArray());
//Continue here
Save without overwriting current files



By : Saif
Date : March 29 2020, 07:55 AM
I have just found myself needing a solution to this same problem. With a little more experience now, I have been able to solve it myself, and I thought I may as well post how I did it in case anyone ever needs it.
I found the following function online, to search the directories:
code :
Function IsFile(ByVal fName As String) As Boolean
'Returns TRUE if the provided name points to an existing file.
'Returns FALSE if not existing, or if it's a folder
    On Error Resume Next
    IsFile = ((GetAttr(fName) And vbDirectory) <> vbDirectory)
End Function
...
TryAgain:
    ...
    Opendialog = Application.GetSaveAsFilename("", filefilter:="PDF Files (*.pdf), *.pdf", _
                                        Title:="Your Doc")
    'if no value is added for file name
    If Opendialog = False Then
        MsgBox "The operation was not successful"
        Exit Sub
    End If
    If IsFile(Opendialog) = True Then
        MsgBox "File Already Exists"
        Opendialog = ""
    End If

    If Opendialog = "" Then
        GoTo TryAgain
    End If