How do I fix ValueError: invalid literal for int() with base 10: 'complete opposite ergonomic design'

Question

This is the input dataset:

| text     | term_index     |
| -------- | -------------- |
| liked aluminum body|[0, 1, 1]|
| lightweight screen beautiful|[0, 1, 0]|

This is my code:

Python

spacy_eg = spacy.load('en')

def tokenize_eng(text):
    return [tok.text for tok in spacy_eg.tokenizer(text)]

TEXT = Field(sequential=True, use_vocab=True,
             tokenize=tokenize_eng, lower=True,
             init_token='<s>', eos_token='</s>', fix_length=104)
INDEX = Field(sequential=False, use_vocab=False,
              init_token='<s>', eos_token='</s>', fix_length=104)
fields1 = [('text', TEXT), ('term_index', INDEX)]
train = TabularDataset(path='/content/trial.csv', format='csv', fields=fields1)

TEXT.build_vocab(train, vectors=GloVe(name="6B", dim=300), max_size=10000, min_freq=2)
dataset_iter = Iterator(train, batch_size=10, train=True, shuffle=True)
for batch in dataset_iter:
    batch.text[0]
    batch.term_index[0]

Output:

ValueError                                Traceback (most recent call last)
<ipython-input-35-51f333c9c2b1> in <module>()
----> 1 for batch in dataset_iter:
      2     batch.text[0]
      3     batch.term_index[0]

ValueError: invalid literal for int() with base 10: 'complete opposite ergonomic design'

How can I deal with it?

What I have tried:

I have checked the Field definitions several times but still cannot solve it.
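
The message itself can be reproduced without torchtext. A minimal sketch, assuming (this is not confirmed in the thread) that a field declared with use_vocab=False converts each raw cell with int(), so a sentence arriving in that column cannot be parsed. try_int is a hypothetical helper used only for illustration:

```python
# Minimal reproduction of the error message, independent of torchtext.
# Assumption: the integer field receives raw string cells and calls int() on them.

def try_int(cell):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return int(cell), None
    except ValueError as exc:
        return None, str(exc)

value, error = try_int("complete opposite ergonomic design")
print(error)

# A list-like cell such as the ones in the sample table is not an int literal either.
print(try_int("[0, 1, 1]")[1])

# A plain integer string converts without problems.
print(try_int("7"))
```

If this reading is right, two things are worth checking: whether the columns in the CSV line up with the order of the fields list, and whether a cell like [0, 1, 1] can be handled by a field that expects a single integer.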

Posted 20-Jul-21 3:10am

Linlin Zeng

Updated 20-Jul-21 4:04am

Comments

Linlin Zeng 21-Jul-21 5:03am

TEXT.vocab.vectors.size() # torch.Size([4, 300])
I suppose something goes wrong when building the vocab.
'''
import spacy
import pandas as pd
from torchtext.legacy.data import Field, TabularDataset, BucketIterator, LabelField
from sklearn.model_selection import train_test_split
from torchtext.vocab import GloVe
'''
These are all the packages I used; I did not import anything else.
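
A vocabulary with only 4 rows would be consistent with no word reaching min_freq = 2, which leaves just the special tokens. The sketch below is a simplified stand-in for frequency-based vocabulary building, not torchtext's implementation; build_vocab_sketch and the '<unk>' and '<pad>' token names are assumptions for illustration:

```python
from collections import Counter

def build_vocab_sketch(token_lists, specials, min_freq=1, max_size=None):
    """Return a token list: specials first, then words meeting min_freq."""
    counts = Counter(tok for tokens in token_lists for tok in tokens)
    # Sort by descending frequency, then alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    words = [tok for tok, freq in ranked if freq >= min_freq]
    if max_size is not None:
        words = words[:max_size]
    return list(specials) + words

specials = ["<unk>", "<pad>", "<s>", "</s>"]  # assumed special tokens
rows = [["liked", "aluminum", "body"], ["lightweight", "screen", "beautiful"]]

vocab_min2 = build_vocab_sketch(rows, specials, min_freq=2)
vocab_min1 = build_vocab_sketch(rows, specials, min_freq=1)
print(len(vocab_min2))  # only the special tokens survive
print(len(vocab_min1))  # specials plus every distinct word
```

With very few rows, or if the text field is reading a column of unique values, every token occurs once and min_freq = 2 filters all of them out.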

2 solutions

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

OriginalGriff · Answer 1 · 2021-07-20T03:32:00

Getting your code to run does not mean it is right!
Think of the development process as writing an email: compiling successfully means that you wrote the email in the right language - English, rather than German for example - not that the email contained the message you wanted to send.

So now you enter the second stage of development (in reality it's the fourth or fifth, but you'll come to the earlier stages later): Testing and Debugging.

Start by looking at what it does do, and how that differs from what you wanted. This is important, because it gives you information as to why it's doing it. For example, if a program is intended to let the user enter a number and it doubles it and prints the answer, then if the input / output was like this:

Input   Expected output    Actual output
  1            2                 1
  2            4                 4
  3            6                 9
  4            8                16

Then it's fairly obvious that the problem is with the bit which doubles it - it's not adding itself to itself, or multiplying it by 2, it's multiplying it by itself and returning the square of the input.
So with that, you can look at the code and it's obvious that it's somewhere here:

C#

int Double(int value)
   {
   return value * value;
   }
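
Since the question is about Python, the same comparison can be sketched there. This is an illustrative translation of the example above, with hypothetical function names (double_buggy, double_fixed, compare):

```python
def double_buggy(value):
    # Bug from the example above: squares the input instead of doubling it.
    return value * value

def double_fixed(value):
    return value * 2

def compare(func, inputs):
    """Return (input, expected, actual) rows for a doubling function."""
    return [(x, 2 * x, func(x)) for x in inputs]

# Print the expected/actual table and flag the rows that differ.
for x, expected, actual in compare(double_buggy, [1, 2, 3, 4]):
    flag = "" if expected == actual else "  <-- mismatch"
    print(f"{x:>5} {expected:>10} {actual:>10}{flag}")
```

Writing the table down this way makes the pattern visible: the row for 2 matches by coincidence, and the other rows show squares rather than doubles.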

Once you have an idea what might be going wrong, start using the debugger* to find out why. Put a breakpoint on the first line of the method, and run your app. When it reaches the breakpoint, the debugger will stop, and hand control over to you. You can now run your code line-by-line (called "single stepping") and look at (or even change) variable contents as necessary (heck, you can even change the code and try again if you need to).
Think about what each line in the code should do before you execute it, and compare that to what it actually did when you use the "Step over" button to execute each line in turn. Did it do what you expect? If so, move on to the next line.
If not, why not? How does it differ?
Hopefully, that should help you locate which part of that code has a problem, and what the problem is.
This is a skill, and it's one which is well worth developing as it helps you in the real world as well as in development. And like all skills, it only improves by use!

* pdb - The Python Debugger - Python 3.9.6 documentation
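
The same kind of inspection can also be done in code, by scanning the data for the first value that cannot be converted. A minimal sketch, assuming the CSV is small enough to scan directly; first_bad_int_cell and the sample rows (including the leading index column) are hypothetical and not taken from the poster's file:

```python
import csv
import io

def first_bad_int_cell(csv_text, column_index):
    """Return (row_number, cell) for the first cell int() rejects, else None."""
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        try:
            int(row[column_index])
        except ValueError:
            return row_number, row[column_index]
    return None

# Hypothetical file contents: an extra index column shifts every field by one.
sample = (
    '0,complete opposite ergonomic design,"[0, 1, 0, 0]"\n'
    '1,liked aluminum body,"[0, 1, 1]"\n'
)

print(first_bad_int_cell(sample, 0))  # column 0 holds integers in this sample
print(first_bad_int_cell(sample, 1))  # column 1 holds the sentences
```

To stop interactively instead, Python's built-in breakpoint() can be placed just before the failing loop, which drops into pdb at that point.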

Richard MacCutchan · Answer 2 · 2021-07-20T04:04:00

Solution 2

Python

for batch in dataset_iter:
    batch.text[0]
    batch.term_index[0]

What are these statements supposed to do? They do not make much sense as Python. Are you using some third-party library that you have not mentioned?

Posted 20-Jul-21 4:04am

Richard MacCutchan

Comments

Linlin Zeng 21-Jul-21 3:48am

dataset_iter = Iterator(
    train, batch_size=10,
    train=True, shuffle=True)
Here dataset_iter is an iterator over train in batches of the given batch size; train is a tabular dataset whose Field objects convert the words into tensors.

Richard MacCutchan 21-Jul-21 4:17am

Well, that is fairly obvious, but it does not answer my question. What is the statement batch.text[0] supposed to do? And what library are you using? I could not find any definition of Field: is it a class or a method, and where does it come from?

Linlin Zeng 21-Jul-21 5:23am

Sorry for misunderstanding your reply. I am using the following packages:

'''import spacy
import pandas as pd
from torchtext.legacy.data import Field, TabularDataset, BucketIterator, LabelField
from sklearn.model_selection import train_test_split
from torchtext.vocab import GloVe'''

for batch in dataset_iter:
    batch.text[0]
    batch.term_index[0]

It is supposed to print:
torchtext.data.batch.Batch of size 10
.text torch.cuda.LongTensor of size 104*10
.term_index torch.cuda.LongTensor of size 104*10
just to show that the batch can be iterated successfully.
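
For term_index to become a tensor of that shape, each cell such as [0, 1, 1] first has to be turned from a string into a fixed-length list of integers. A minimal sketch, assuming the cells are stored as Python-style list literals and that padding with 0 is acceptable (both are assumptions; parse_term_index is a hypothetical helper, not part of torchtext):

```python
import ast

def parse_term_index(cell, fix_length, pad_value=0):
    """Parse a cell like '[0, 1, 1]' and pad or truncate it to fix_length."""
    values = ast.literal_eval(cell)  # safely evaluates literals only
    if not isinstance(values, list) or not all(isinstance(v, int) for v in values):
        raise ValueError(f"expected a list of ints, got {cell!r}")
    values = values[:fix_length]
    return values + [pad_value] * (fix_length - len(values))

padded = parse_term_index("[0, 1, 1]", fix_length=104)
print(len(padded), padded[:5])
```

Whether the padding value should be 0 or a dedicated label depends on how the term indices are used downstream, which the thread does not say.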

Richard MacCutchan 21-Jul-21 5:56am

OK, well you need to check the documentation for the class that creates that iterator, to find out exactly what sort of object is returned as batch. As it stands text[0] and term_index[0] do not look like methods that will print anything.
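
In a script, and inside a loop in a notebook cell, a bare expression such as batch.text[0] is evaluated and then discarded, so nothing is displayed. A minimal sketch with a stand-in object (FakeBatch is hypothetical and only mimics attribute access; it is not a torchtext class):

```python
from collections import namedtuple

# Hypothetical stand-in for a batch object with two attributes.
FakeBatch = namedtuple("FakeBatch", ["text", "term_index"])

batches = [
    FakeBatch(text=[[2, 5, 3]], term_index=[[0, 1, 1]]),
    FakeBatch(text=[[2, 7, 3]], term_index=[[0, 1, 0]]),
]

shown = []
for batch in batches:
    batch.text[0]  # evaluated and discarded: nothing is displayed
    line = f"text={batch.text[0]} term_index={batch.term_index[0]}"
    print(line)    # an explicit print is needed inside a loop
    shown.append(line)
```

Printing the batch object itself, or the shapes of its attributes, is usually enough to confirm that iteration works.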
