R list within matrix to dataframe conversion
By : Ale Ale
Date : March 29 2020, 07:55 AM
I am using the following to extract quotations from text, with multiple results, on a large dataset. I am trying to have the output be a character string within a data frame, so I can easily share it as a CSV with others. Code:
normalCase <- 'He said, "I am a test," very quickly.'
endCase <- 'This is a long quote, which we said, "Would never happen."'
shortCase <- 'A "quote" yo'
beginningCase <- '"I said this," he said quickly'
multipleCase <- 'When asked, "No," said Sam "I do not like green eggs and ham."'
testdata <- c(normalCase, endCase, shortCase, beginningCase, multipleCase)
# extract quotations
gsub(pattern = "[^\"]*((?:\"[^\"]*\")|$)", replacement = "\\1 ", x = testdata)
[1] "\"I am a test,\" "
[2] "\"Would never happen.\" "
[3] "\"quote\" "
[4] "\"I said this,\" "
[5] "\"No,\" \"I do not like green eggs and ham.\" "
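
For comparison, here is a minimal Python sketch of the same idea using only the standard library: collect every double-quoted span per input string, join the spans into one string, and write the result as CSV. The helper name `extract_quotes` and the column names `text` and `quotes` are assumptions made for illustration, not part of the original question.

```python
import csv
import io
import re

test_data = [
    'He said, "I am a test," very quickly.',
    'This is a long quote, which we said, "Would never happen."',
    'A "quote" yo',
    '"I said this," he said quickly',
    'When asked, "No," said Sam "I do not like green eggs and ham."',
]


def extract_quotes(text):
    """Return all double-quoted spans in text, joined by a single space."""
    return " ".join(re.findall(r'"[^"]*"', text))


# One row per input string: the original text and its extracted quotations.
rows = [(text, extract_quotes(text)) for text in test_data]

# Write a two-column CSV into an in-memory buffer (column names are assumed).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["text", "quotes"])
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Keeping all quotations for a string in one cell means each input maps to exactly one CSV row, which is usually easier to share than a nested list.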

Tapply over matrix using matrix math
By : user3551941
Date : March 29 2020, 07:55 AM
I have the following code and would like to generalize it to more clusters, i.e. C clusters. Is there a way to do this without a loop? Here, the rows of X correspond to the variables x1, x2, and T is a linear transformation of X. You could try doing this via tapply. Code:
tapply(seq_len(ncol(X)), cluster, function(x) f(T%*%X[, x]))
# 0 1
# 3840.681 1238.826
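
The grouping step that tapply performs can be sketched in plain Python: group the column indices of X by cluster label, then apply a function to the transformed columns of each group. The original does not show `f`, so the `sum_of_squares` function below is a stand-in, and the small matrices are hypothetical inputs.

```python
from collections import defaultdict


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def apply_by_cluster(X, T, cluster, f):
    """Group column indices of X by label, then apply f to T times those columns."""
    groups = defaultdict(list)
    for j, label in enumerate(cluster):
        groups[label].append(j)
    result = {}
    for label in sorted(groups):
        # Keep only the columns of X that belong to this cluster.
        sub = [[row[j] for j in groups[label]] for row in X]
        result[label] = f(matmul(T, sub))
    return result


def sum_of_squares(M):
    """Stand-in for the unspecified f: sum of squared entries."""
    return sum(v * v for row in M for v in row)


# Hypothetical inputs: 2 variables (rows), 4 observations (columns), 2 clusters.
X = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
T = [[1, 0],
     [1, 1]]
cluster = [0, 1, 0, 1]

totals = apply_by_cluster(X, T, cluster, sum_of_squares)
```

Because the groups are built from whatever labels appear in `cluster`, the same code handles any number of clusters without changes.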

Using tapply on data with counts to add zeros and NAs
By : krunal shah
Date : March 29 2020, 07:55 AM
I have a DB composed of: species ID (as a factor), counts, site, visit, year. A subset is available here [Google Drive]. Given that your data is in a data.frame df, code:
library(reshape2)
tmp <- dcast(df, site + visit + year ~ species, value.var = 'counts', fill = 0)
df <- melt(tmp, id.vars = c('site', 'visit', 'year'), variable.name = 'species', value.name = 'counts')
y <- tapply(df$counts, list(df$species, df$site, df$visit, df$year), sum)
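
The intent of the cast/melt/tapply steps can be illustrated with a small Python sketch: a species gets a 0 for every survey (site, visit, year) that took place but did not record it, while combinations that were never surveyed stay missing. The records below are invented for illustration, and `None` plays the role of NA.

```python
from collections import defaultdict
from itertools import product

# Hypothetical records: (species, site, visit, year, count).
records = [
    ("sp1", "A", 1, 2019, 3),
    ("sp2", "A", 1, 2019, 1),
    ("sp1", "A", 2, 2019, 2),
    ("sp1", "B", 1, 2019, 4),
]

species = sorted({r[0] for r in records})
sites = sorted({r[1] for r in records})
visits = sorted({r[2] for r in records})
years = sorted({r[3] for r in records})

# Surveys that actually took place, and summed counts per species and survey.
surveys = {r[1:4] for r in records}
totals = defaultdict(int)
for sp, site, visit, year, count in records:
    totals[(sp, site, visit, year)] += count

# 0 when the survey happened but the species was absent; None (like NA) when
# that site/visit/year combination was never surveyed.
table = {}
for sp, site, visit, year in product(species, sites, visits, years):
    if (site, visit, year) in surveys:
        table[(sp, site, visit, year)] = totals.get((sp, site, visit, year), 0)
    else:
        table[(sp, site, visit, year)] = None
```

Separating "surveyed but absent" from "not surveyed" is the design point here: only the former should count as a true zero.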

Conversion of pandas dataframe to sparse keyitem matrix with composite key
By : Doron Fital
Date : March 29 2020, 07:55 AM
I have a data frame of 3 columns. Col 1 is a string order number, Col 2 is an integer day, and Col 3 is a product name. I would like to convert this into a matrix where each row represents a unique order/day combination, and each column represents a 1/0 for the presence of a product name for that combination. Here's a NumPy-based approach to get an array as output. Code:
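import numpy as np
from scipy.sparse import coo_matrix

# Composite key: encode each (ord_nb, day) pair as one integer, then map the
# unique codes to consecutive row indices. Requires ord_nb to be castable to int.
a = df[['ord_nb','day']].values.astype(int)
row = np.unique(np.ravel_multi_index(a.T, a.max(0)+1), return_inverse=True)[1]
# Map each product name to a column index.
col = np.unique(df.prd.values, return_inverse=True)[1]
out_shp = row.max()+1, col.max()+1
# Dense 0/1 output.
out = np.zeros(out_shp, dtype=int)
out[row,col] = 1
# Sparse output in COO format.
d = np.ones(row.size, dtype=int)
out_sparse = coo_matrix((d,(row,col)), shape=out_shp)
Sample run: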
In [232]: df
Out[232]:
   ord_nb  day prd
0       1    1   A
1       1    1   B
2       1    2   B
3       1    2   C
4       1    2   D
In [233]: out
Out[233]:
array([[1, 1, 0, 0],
       [0, 1, 1, 1]])
In [241]: out_sparse
Out[241]:
<2x4 sparse matrix of type '<type 'numpy.int64'>'
with 5 stored elements in COOrdinate format>
In [242]: out_sparse.toarray()
Out[242]:
array([[1, 1, 0, 0],
       [0, 1, 1, 1]])
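
The composite-key indexing can also be sketched without NumPy, which makes the row and column mapping explicit. This is a simplified illustration using dictionaries rather than the approach in the answer above. The variable names are assumptions, and the records mirror the sample data frame.

```python
# Rows mirror the sample data frame: (ord_nb, day, prd).
records = [(1, 1, "A"), (1, 1, "B"), (1, 2, "B"), (1, 2, "C"), (1, 2, "D")]

# Each unique (ord_nb, day) pair becomes a row; each product becomes a column.
row_keys = sorted({(ord_nb, day) for ord_nb, day, _ in records})
col_keys = sorted({prd for _, _, prd in records})
row_index = {key: i for i, key in enumerate(row_keys)}
col_index = {key: j for j, key in enumerate(col_keys)}

# Sparse form: the set of (row, col) positions that hold a 1.
entries = {(row_index[(o, d)], col_index[p]) for o, d, p in records}

# Dense form: a 0/1 matrix as a list of rows.
dense = [
    [1 if (i, j) in entries else 0 for j in range(len(col_keys))]
    for i in range(len(row_keys))
]
```

Since tuples are used directly as dictionary keys, this variant also works when the order number is a non-numeric string.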

My input matrix does not contain zeros, but I get zeros when I print the matrix. The problem description is given below
By : user3558549
Date : March 29 2020, 07:55 AM

