AttributeError: module 'torchtext' has no attribute 'legacy' - torchtext

I am trying to use torchtext to process test data, but I get the error "AttributeError: module 'torchtext' has no attribute 'legacy'" when I run the following code. Can anyone explain what the issue is? I am using Python 3.10.4. Thanks.
import pandas as pd
import torch
import torchtext
import spacy

def prep_data(file_path):
    TEXT = torchtext.legacy.data.Field(tokenize='spacy', tokenizer_language='en_core_web_sm')
    LABEL = torchtext.legacy.data.LabelField(dtype=torch.long)
    fields = [('clean_text', TEXT), ('label', LABEL)]
    dataset = torchtext.legacy.data.TabularDataset(
        path=file_path, format='csv',
        skip_header=True, fields=fields)
    print(dataset.examples[0])

if __name__ == "__main__":
    train_path = './data/train.csv'
    test_path = './data/test.csv'
    prep_data(train_path)

I fixed the same issue by installing torchtext 0.9:
pip install torchtext==0.9

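I also had the same issue, and solved it by installing a matching stable version. The legacy module exists only in torchtext 0.9, 0.10 and 0.11; it was removed in 0.12, so newer versions raise this error.
Either pin one of the versions that still ship legacy, or update to a recent version such as 0.13 or 0.14 and move your code off the legacy API.
pip install torchtext==<version>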
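To make the version advice above concrete, here is a minimal sketch that decides from a version string whether torchtext.legacy should be present. It assumes the legacy namespace shipped in 0.9 through 0.11 only; the helper name has_legacy_module is made up for illustration and is not part of torchtext.

```python
def has_legacy_module(version: str) -> bool:
    """Return True if this torchtext version is expected to ship torchtext.legacy.

    Assumption: the legacy namespace exists in 0.9, 0.10 and 0.11 only.
    """
    # Drop any local build suffix, then keep major.minor: "0.11.2+cu113" -> (0, 11).
    major, minor = (int(part) for part in version.split("+")[0].split(".")[:2])
    return (0, 9) <= (major, minor) < (0, 12)


for v in ("0.8.1", "0.9.0", "0.11.2", "0.12.0", "0.14.1"):
    print(v, has_legacy_module(v))
```

In a real environment you would pass torchtext.__version__ to the helper to see which side of the boundary your installation is on.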

ImportError: cannot import name 'escape' from 'cgi'

I'm getting the error message "ImportError: cannot import name 'escape' from 'cgi'" when I try to use the following code in PyCharm:
import nltk
parser = nltk.ChartParser(grammar, trace=0)
for tree in parser.parse(sent):
    print(tree)
    tree.pretty_print(unicodelines=True)
What should I do to correct it?
cgi.escape() was removed in Python 3.8. Quoting the Python 3.8 release notes:
parse_qs, parse_qsl, and escape are removed from the cgi module. They
are deprecated in Python 3.2 or older. They should be imported from
the urllib.parse and html modules instead.
Since the error comes from a third-party module you are importing, try running it on an older Python version (before 3.8).
I updated the supervisor package to:
supervisor==4.1.0
https://pypi.org/project/supervisor/4.1.0/
The changelog for that release says: [Fixed a Python 3.8 compatibility issue caused by the removal of cgi.escape(). Patch by Mattia Procopio.]
Problem solved.
You can use html.escape instead of cgi.escape.
It worked for me
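For code you control, the swap to html.escape can be sketched as below. One difference worth knowing: html.escape escapes quotes by default (quote=True), while the old cgi.escape did not. The wrapper name escape is only an illustrative stand-in that mimics the old default; it is not something the standard library provides.

```python
import html


def escape(text, quote=False):
    """Illustrative stand-in for the removed cgi.escape (quotes left alone by default)."""
    return html.escape(text, quote=quote)


sample = '<b>"Tom" & Jerry</b>'
print(escape(sample))       # old cgi.escape-style behaviour: quotes untouched
print(html.escape(sample))  # html.escape default: quotes escaped as well
```

If the failing call lives inside a third-party package, upgrading that package is usually cleaner than patching it locally.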

How to use importlib.metadata from Python 3.8

I've been trying to understand the importlib.metadata library from Python 3.8 but can't figure out why it won't work.
Following the documentation at https://docs.python.org/3.8/library/importlib.metadata.html, after installing Python 3.8 and the wheel package (via pip):
>>> from importlib.metadata import version
>>> version('wheel')
ImportError: cannot import name 'MetadataPathFinder' from 'importlib.metadata'
Running the following command helped in my case:
python -c "import importlib.metadata, shutil, pathlib; file = pathlib.Path(importlib.metadata.__file__); str(file).endswith('__init__.py') and shutil.rmtree(file.parent) and print('removed', file.parent)"
Taken from https://bugs.python.org/issue38342#msg353736
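Because that one-liner deletes a directory, it may be worth inspecting first. The sketch below is a read-only version of the same check: it only reports whether importlib.metadata was loaded from a package directory (an __init__.py) or from a single file. The function name describe_layout is hypothetical. As far as I know, Python 3.8 ships it as a single file, so a package directory there suggests leftovers from an earlier install, whereas newer Python versions use a package directory as their normal layout and nothing should be removed.

```python
import importlib.metadata
import pathlib
import sys


def describe_layout(module):
    """Report whether a module was loaded from a package directory or a single file."""
    path = pathlib.Path(module.__file__)
    kind = "package directory" if path.name == "__init__.py" else "single module"
    return kind, path


kind, path = describe_layout(importlib.metadata)
print("Python", sys.version_info[:2], "->", kind, "at", path)
```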

Error when running pyspark

I tried to run pyspark from the terminal. I run snotebook, which automatically launches Jupyter. When I then select python3, the following error appears in the terminal:
[IPKernelApp] WARNING | Unknown error in handling PYTHONSTARTUP file
/Users/simon/spark-1.6.0-bin-hadoop2.6/python/pyspark/shell.py
Here's my .bash_profile setting:
export PATH="/Users/simon/anaconda/bin:$PATH"
export SPARK_HOME=~/spark-1.6.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
export PYSPARK_PYTHON=python3
alias snotebook='$SPARK_HOME/bin/pyspark'
Please let me know if you have any ideas, thanks.
You need to set one of the lines below in your configuration:
PYSPARK_DRIVER_PYTHON=ipython
or
PYSPARK_DRIVER_PYTHON=ipython3
Hope it helps.
In my case, I was using a virtual environment and forgot to install Jupyter, so it was using some version that it found in the $PATH. Installing it inside the environment fixed this issue.
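A quick way to check for that situation is to see which jupyter executable the shell would pick up and whether it lives inside the active environment. This is a minimal sketch using only the standard library; locate_tool is a hypothetical helper name, and comparing against sys.prefix is a heuristic that assumes the environment's executables sit under that directory.

```python
import os
import shutil
import sys


def locate_tool(name):
    """Return (path, inside_env) for an executable found on PATH, or (None, False)."""
    path = shutil.which(name)
    if path is None:
        return None, False
    # sys.prefix points at the active virtual environment (or the base install).
    prefix = os.path.abspath(sys.prefix) + os.sep
    return path, os.path.abspath(path).startswith(prefix)


path, inside = locate_tool("jupyter")
print("jupyter:", path, "| inside current environment:", inside)
```

If the path is found but lies outside the environment, installing Jupyter inside the environment, as described above, is the likely fix.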
Spark now includes PySpark as part of the install, so remove the standalone PySpark library unless you really need it.
Remove the old Spark and install the latest version.
Install the findspark library (via pip).
In Jupyter, import and use findspark:
import findspark
findspark.init()
Quick PySpark / Python 3 check:
import findspark
findspark.init()
from pyspark import SparkContext
sc = SparkContext()
print(sc)
sc.stop()

SVG support missing in PySide 1.2.1

With PySide 1.2.1 installed through the Canopy Package Manager, I get the following set of supported image formats:
>>> from PySide import QtGui
>>> QtGui.QImageReader.supportedImageFormats()
[PySide.QtCore.QByteArray('bmp'),
PySide.QtCore.QByteArray('pbm'),
PySide.QtCore.QByteArray('pgm'),
PySide.QtCore.QByteArray('png'),
PySide.QtCore.QByteArray('ppm'),
PySide.QtCore.QByteArray('xbm'),
PySide.QtCore.QByteArray('xpm')]
If I downgrade to PySide 1.1.0, I get the following:
>>> from PySide import QtGui
>>> QtGui.QImageReader.supportedImageFormats()
[PySide.QtCore.QByteArray('bmp'),
PySide.QtCore.QByteArray('gif'),
PySide.QtCore.QByteArray('ico'),
PySide.QtCore.QByteArray('jpeg'),
PySide.QtCore.QByteArray('jpg'),
PySide.QtCore.QByteArray('mng'),
PySide.QtCore.QByteArray('pbm'),
PySide.QtCore.QByteArray('pgm'),
PySide.QtCore.QByteArray('png'),
PySide.QtCore.QByteArray('ppm'),
PySide.QtCore.QByteArray('svg'),
PySide.QtCore.QByteArray('svgz'),
PySide.QtCore.QByteArray('tif'),
PySide.QtCore.QByteArray('tiff'),
PySide.QtCore.QByteArray('xbm'),
PySide.QtCore.QByteArray('xpm')]
Is there some extra configuration required to restore the missing formats?
I'm running Canopy v1.3.0.1715 on Mac OS X.
The extra image format handlers are distributed as Qt plugins, but it appears that Qt is not able to find them despite the presence of a qt.conf file. We'll get that fixed in a future release, but in the meantime you can work around the issue by setting the QT_PLUGIN_PATH variable in the environment. For example:
export QT_PLUGIN_PATH=/Applications/Canopy.app/appdata/canopy-1.3.0.1715.macosx-x86_64/Canopy.app/Contents/plugins
[edit]
Actually, the plugins folder is found properly once the application object has been created:
>>> from PySide import QtCore, QtGui
>>> app = QtCore.QCoreApplication([])
>>> import pprint
>>> pprint.pprint(QtGui.QImageReader.supportedImageFormats())
[PySide.QtCore.QByteArray('bmp'),
PySide.QtCore.QByteArray('gif'),
PySide.QtCore.QByteArray('ico'),
PySide.QtCore.QByteArray('jpeg'),
PySide.QtCore.QByteArray('jpg'),
PySide.QtCore.QByteArray('mng'),
PySide.QtCore.QByteArray('pbm'),
PySide.QtCore.QByteArray('pgm'),
PySide.QtCore.QByteArray('png'),
PySide.QtCore.QByteArray('ppm'),
PySide.QtCore.QByteArray('tga'),
PySide.QtCore.QByteArray('tif'),
PySide.QtCore.QByteArray('tiff'),
PySide.QtCore.QByteArray('xbm'),
PySide.QtCore.QByteArray('xpm')]
>>>
But the svg format still seems to be missing. I'll look into that further.

Issue with importing scipy.integrate or scipy.integrate.quad

This might be something really simple: I am using Python 2.6.5 and I am unable to load any integration module in my workspace. Everything is OK when I import scipy, but if I try to import scipy.integrate or scipy.integrate.quad I get an error message from Python. Any clue? Thanks.
Try from scipy import integrate.
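Part of the confusion may be that an import statement with a dotted path only works when every component is a package or module. quad is a function inside scipy.integrate, so "import scipy.integrate.quad" cannot work on any install, while "from scipy.integrate import quad" can. The sketch below shows the same rule with the standard library's json package as a stand-in (so it runs without scipy); can_import is a hypothetical helper name. It does not cover the separate case where "import scipy.integrate" itself fails.

```python
import importlib


def can_import(dotted_name):
    """Return True if the dotted name can be imported as a module."""
    try:
        importlib.import_module(dotted_name)
    except ImportError:
        return False
    return True


print(can_import("json.decoder"))  # a real submodule of the json package
print(can_import("json.dumps"))    # a function, not a module

# Functions are reached with "from package import name" instead.
from json import dumps
print(dumps([1, 2]))
```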