Python Pandas Threading reads read_fwf


By : Kyehani
Date : October 16 2020, 08:10 PM
I managed to get this working and fixed my problems by using the multiprocessing package. I ran into two issues.
1) The multiprocessing package is not compatible with Jupyter Notebook.
code :
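The code that belonged to this answer did not survive on the page. As a rough stand-in, and not the author's actual solution, the sketch below reads several fixed-width files concurrently and concatenates the results. The helper name `read_fwf_many`, the `widths` values and the demo files are all made up for illustration. It uses threads by default; passing `concurrent.futures.ProcessPoolExecutor` as `executor_cls` should give process-based parallelism closer to what the multiprocessing package does, in which case the call belongs under an `if __name__ == "__main__":` guard in a script.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from functools import partial

import pandas as pd


def read_fwf_many(paths, executor_cls=ThreadPoolExecutor, max_workers=4, **read_kwargs):
    """Read several fixed-width files concurrently and stack them into one DataFrame.

    Hypothetical helper: extra keyword arguments are forwarded to pd.read_fwf.
    """
    reader = partial(pd.read_fwf, **read_kwargs)
    with executor_cls(max_workers=max_workers) as pool:
        # map() yields results in the same order as `paths`
        frames = list(pool.map(reader, paths))
    return pd.concat(frames, ignore_index=True)


if __name__ == "__main__":
    # Write three tiny demo files (made-up data), then read them back together.
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            path = os.path.join(tmp, f"part{i}.txt")
            with open(path, "w") as fh:
                fh.write(f"{'id':<5}{'name':<5}\n")
                fh.write(f"{i:<5}{'abc':<5}\n")
            paths.append(path)
        print(read_fwf_many(paths, widths=[5, 5]))
```

Threads are the simpler choice when the time goes into waiting on disk or network; processes may pay off when parsing itself is the bottleneck.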



I need your help about read_fwf in python pandas


By : user3398337
Date : March 29 2020, 07:55 AM
Let me take a stab in the dark: you want to turn one column of this table into a list (a "vertical category") and ignore the other columns?
I didn't have your exact text, so I approximated it. My column widths were different from yours ([11,21,31]), and I omitted the nwors argument (you probably meant nrows, but it is superfluous in this case). The column spec isn't very precise, but a few seconds of fiddling left me with a workable DataFrame:
code :
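# promote the first parsed row to column names
df.columns = list(df.loc[0])
# keep only the data rows; .loc replaces the removed .ix indexer (label-based, end-inclusive)
df = df.loc[2:6]
df['Chapter']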
2    1-1
3    1-2
4    1-3
5    1-4
6    1-5
Name: Chapter, dtype: object
list(df['Chapter'])
['1-1', '1-2', '1-3', '1-4', '1-5']
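Since the answer's own input text is not shown, here is a self-contained sketch of the same idea: parse a fixed-width table with explicit `widths` and pull one column out as a list. The rows, titles and widths below are invented for illustration and are not the asker's data.

```python
import io

import pandas as pd

# Made-up rows standing in for the asker's table; each field is padded to a fixed width.
rows = [
    ("Chapter", "Title", "Pages"),
    ("1-1", "Introduction", "12"),
    ("1-2", "Background", "30"),
    ("1-3", "Methods", "25"),
]
text = "".join(f"{a:<11}{b:<20}{c:<5}\n" for a, b, c in rows)

# widths gives the character count of each column, left to right;
# the first line is used as the header by default
df = pd.read_fwf(io.StringIO(text), widths=[11, 20, 5])

chapters = df["Chapter"].tolist()
print(chapters)
```

When the header line is part of the fixed-width layout, as assumed here, there is no need to promote a data row to column names afterwards.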

Pandas read_fwf: specify dtype


By : Huiying Li
Date : March 29 2020, 07:55 AM
The converters parameter can be used to preserve the data as strings, since pd.read_fwf does not try to guess the dtype of a column when a converter is specified for it:
code :
import pandas as pd
try:
    # for Python2
    from cStringIO import StringIO 
except ImportError:
    # for Python3
    from io import StringIO

content = '''\
1.0    2    A
3.0    4    B
5      X    C
M      Y    D
'''
header = ['foo', 'bar', 'baz']

for df in pd.read_fwf(StringIO(content), header=None, chunksize=2, names=header,
                      converters={h:str for h in header}):
    print(df)
df.info()
   foo bar baz
0  1.0   2   A
1  3.0   4   B

  foo bar baz
0   5   X   C
1   M   Y   D

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2 entries, 0 to 1
Data columns (total 3 columns):
foo    2 non-null object
bar    2 non-null object
baz    2 non-null object
dtypes: object(3)
memory usage: 120.0+ bytes
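As far as I know, recent pandas versions also let read_fwf take a `dtype` argument directly, which may be shorter than a converters dict. The sketch below reuses the sample content from above and assumes a pandas version where `dtype=str` is honoured by read_fwf; on older versions the converters approach is the safer route.

```python
import io

import pandas as pd

content = (
    "1.0    2    A\n"
    "3.0    4    B\n"
    "5      X    C\n"
    "M      Y    D\n"
)
header = ["foo", "bar", "baz"]

# dtype=str asks pandas to keep every field as text, chunk by chunk
chunks = list(
    pd.read_fwf(io.StringIO(content), header=None, names=header,
                chunksize=2, dtype=str)
)

first = chunks[0]
print(first["foo"].tolist(), first["bar"].tolist())
```

The first chunk is the interesting one: without `dtype` or converters its values look numeric and would be parsed as numbers, while the second chunk contains letters and stays text either way.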

Pandas read_fwf ignores columns


By : Zdzisław Zajadek
Date : March 29 2020, 07:55 AM
Question: I have a .asc file where each line has 655 entries (note the leading whitespace).
UPDATE2: using the colspecs parameter when calling read_fwf():
code :
In [83]: df = pd.read_fwf(fn, skiprows=6, header=None, na_values=[-999],
   ....:                  colspecs=[(5,6)] * 654)


In [84]: df.head()
Out[84]:
   0    1    2    3    4    5    6    7    8    9   ...   644  645  646  647  648  649  650  651  652  653
0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
4  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN

[5 rows x 654 columns]
In [61]: fn = r'D:\download\BRD_8110_YY_GIS.asc'

In [62]: df = pd.read_csv(fn, skiprows=6, header=None, na_values=[-999], sep=r'\s+')

In [63]: df.head()
Out[63]:
   0    1    2    3    4    5    6    7    8    9   ...   644  645  646  647  648  649  650  651  652  653
0  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
4  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN ...   NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN

[5 rows x 654 columns]
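To make the two approaches easier to compare, here is a small sketch on invented data: three right-aligned fields of six characters each, with -999 as the missing-value marker and leading whitespace on every line. The sample text and the field width are assumptions, not the layout of the .asc file from the question. Each colspecs entry is a half-open (start, end) character interval.

```python
import io

import pandas as pd

# Made-up grid rows; every field is 6 characters wide and right-aligned.
text = (
    "  -999  -999   1.5\n"
    "   2.0  -999   3.5\n"
)

# Fixed-width view: one half-open [start, end) interval per column
colspecs = [(i * 6, (i + 1) * 6) for i in range(3)]
fwf = pd.read_fwf(io.StringIO(text), header=None,
                  colspecs=colspecs, na_values=[-999])

# Whitespace-delimited view: any run of blanks separates two fields
csv = pd.read_csv(io.StringIO(text), header=None,
                  sep=r"\s+", na_values=[-999])

print(fwf)
print(csv)
```

For data like this, where fields never run into each other, the whitespace-delimited read needs no column positions at all; explicit colspecs matter when adjacent fields can touch.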

read_fwf in pandas in Python does not use comment character if colspecs argument does not include first column


By : Baterdene Batchuluun
Date : March 29 2020, 07:55 AM
I think this is a bug: the documentation says "if the line starts with a comment then the entire line is skipped". The problem is that columns are subsetted by FixedWidthReader.__next__ before they are checked for comments (in PythonParser or CParserWrapper). The relevant code is in io/parsers.py.
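One possible workaround, sketched here under the assumption that the file fits in memory, is to drop comment lines yourself before pandas ever sees them, so the result no longer depends on whether colspecs covers the first column. The helper name `read_fwf_without_comments` and the sample text are hypothetical.

```python
import io

import pandas as pd


def read_fwf_without_comments(text, comment="#", **read_kwargs):
    """Drop lines that start with `comment`, then hand the rest to pd.read_fwf."""
    kept = [line for line in text.splitlines(keepends=True)
            if not line.startswith(comment)]
    return pd.read_fwf(io.StringIO("".join(kept)), **read_kwargs)


# Made-up input: comment lines mixed with two data rows
text = (
    "# file header\n"
    "aaa 1111\n"
    "# note in the middle\n"
    "bbb 2222\n"
)

# colspecs deliberately skips the first column, the case described above
df = read_fwf_without_comments(text, colspecs=[(4, 8)], header=None)
print(df[0].tolist())
```

This only handles whole-line comments; trailing comments after data are left untouched.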

Pandas read_fwf


By : Nir Yaron
Date : March 29 2020, 07:55 AM
Is the problem only with the "D" row, whose text is split across 3 lines? If so, you can do:
code :
s = """
    0 2017   IX 2018       X 2018       X 2018       X 2018        0 2017   IX 2018   X 2018    X 2018   X 2018
                                                                                             0 2017       IX 2018                                    0 2017   IX 2018

    UKUPNO                                               1.053    1.075         1.093        103,8        101,7         1.633    1.669     1.701      104,2    101,9
A   Poljoprivreda, šumarstvo i ribolov                     907      888           925        102,0        104,2         1.394    1.356     1.420      101,9    104,7
B   Vađenje ruda i kamena                                  913      919           839         91,9         91,3         1.395    1.406     1.297       93,0     92,2
C   Prerađivačka industrija                                769      764           775        100,8        101,4         1.176    1.169     1.187      100,9    101,5
D   Proizvodnja i snabdijevanje                          1.574    1.570         1.647        104,6        104,9         2.459    2.455     2.579      104,9    105,1
    električnom energijom, plinom,
    parom i klimatizacija
"""

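# pd.compat.StringIO is gone from current pandas; io.StringIO is the stdlib equivalent
import io
import pandas as pd

df = pd.read_fwf(io.StringIO(s), header=None, skiprows=5)

# continuation lines have no row letter, so carry the previous one forward
df[0] = df[0].ffill()
# join the text fragments that share a row letter into one label
df[1] = df[0].map(df[1].groupby(df[0]).agg(lambda x: ' '.join(x)))
# continuation lines carry no numbers, so they drop out here
df = df.dropna(axis=0)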
df
    0   1   2   3   4   5   6   7   8   9   10  11
0   A   Poljoprivreda, šumarstvo i ribolov  907.000 888.00  925.000 102,0   104,2   1.394   1.356   1.420   101,9   104,7
1   B   Vađenje ruda i kamena   913.000 919.00  839.000 91,9    91,3    1.395   1.406   1.297   93,0    92,2
2   C   Prerađivačka industrija 769.000 764.00  775.000 100,8   101,4   1.176   1.169   1.187   100,9   101,5
3   D   Proizvodnja i snabdijevanje električnom energi...   1.574   1.57    1.647   104,6   104,9   2.459   2.455   2.579   104,9   105,1
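The same forward-fill and join idea can be tried on a much smaller table. The sketch below uses invented rows and widths (4, 20 and 5 characters), so it illustrates the technique only and does not reproduce the table above.

```python
import io

import pandas as pd

# Made-up rows: the label for "D" continues on two extra lines with no code or value.
rows = [
    ("A", "Agriculture", "907"),
    ("D", "Electricity, gas,", "1574"),
    ("", "steam and air", ""),
    ("", "conditioning", ""),
]
text = "".join(f"{code:<4}{label:<20}{value:>5}\n" for code, label, value in rows)

df = pd.read_fwf(io.StringIO(text), widths=[4, 20, 5], header=None)

# continuation lines inherit the code of the row they belong to
df[0] = df[0].ffill()
# glue all label fragments of one code together
joined = df[1].groupby(df[0]).agg(" ".join)
df[1] = df[0].map(joined)
# only the first line of each row has a value, so the continuation lines drop out
df = df.dropna(axis=0).reset_index(drop=True)
print(df)
```

Note that this relies on the row code being unique per logical row; repeated codes would have their labels merged as well.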