How to extract numbers from a string

Question

I want to extract only a number from the following:

RTGS-VEERPAL/KARBH13267941306
TO STAMP PAPER CHARGES
2013
1348
By Clg/33790/ICICI/DELCTS1/166
RTGS-KUSUM TOMAR/CNRBH13271696922
RTGS-SHRI BALAJI BUILDHOME PVT /KARBH13294076078
By Clg/956226/AXIS/DELCTS1/171

What I have tried:

I've tried using the regular expression

[\d]{4,}

Although it extracts the numbers, it also matches the digits inside the alphanumeric parts (for example, 13267941306 in KARBH13267941306), which is not what I want.

I want

Posted 1-Jan-23 3:54am

Member 15881193

Updated 11-Jan-23 20:23pm

Comments

0x01AA 1-Jan-23 11:49am

So the first thing you have to do is define all the rules. As far as I can see, they are:
a.) Numbers on a line of their own
b.) Numbers delimited by '/' (except when the number is at the end of the line?)
c.) What else? E.g. numbers delimited by whitespace characters like space, tab, ...? E.g. 'AlphaPart 3051 ,'
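
A rough sketch of how rules a.) and b.), plus the whitespace case from c.), could be written down, assuming a number only counts when it is a whole token between '/' characters, whitespace, or line ends. The name extract_standalone_numbers and the min_digits parameter are made up for illustration and do not come from the question:

```python
import re

SAMPLE = '''RTGS-VEERPAL/KARBH13267941306
TO STAMP PAPER CHARGES
2013
1348
By Clg/33790/ICICI/DELCTS1/166
RTGS-KUSUM TOMAR/CNRBH13271696922
RTGS-SHRI BALAJI BUILDHOME PVT /KARBH13294076078
By Clg/956226/AXIS/DELCTS1/171'''

def extract_standalone_numbers(text, min_digits=1):
    """Return tokens made only of ASCII digits (hypothetical helper).

    Tokens are separated by '/' or any whitespace, including newlines.
    """
    tokens = re.split(r'[/\s]+', text)
    return [
        tok for tok in tokens
        # isascii() keeps out non-ASCII digit characters that isdigit() accepts
        if tok.isascii() and tok.isdigit() and len(tok) >= min_digits
    ]

print(extract_standalone_numbers(SAMPLE))
print(extract_standalone_numbers(SAMPLE, min_digits=4))
```

With min_digits=4 the short trailing codes (166 and 171) are dropped, which mirrors the {4,} in the pattern the question already tried. Whether those should be kept is one of the rules that still needs defining.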

adriancs 1-Jan-23 23:36pm

import re, then re.findall(r'\d+', datasource_string)
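
For illustration, a small sketch of what that call returns on the sample from the question, next to a word-boundary variant. The \b version is an addition here, not part of the comment above, and it assumes that "standalone" means not touching a letter, digit, or underscore:

```python
import re

datasource_string = '''RTGS-VEERPAL/KARBH13267941306
TO STAMP PAPER CHARGES
2013
1348
By Clg/33790/ICICI/DELCTS1/166
RTGS-KUSUM TOMAR/CNRBH13271696922
RTGS-SHRI BALAJI BUILDHOME PVT /KARBH13294076078
By Clg/956226/AXIS/DELCTS1/171'''

# Every run of digits, including those glued to letters (KARBH..., DELCTS1)
all_runs = re.findall(r'\d+', datasource_string)

# Only runs with a word boundary on both sides, so digits that follow
# letters directly are skipped
standalone = re.findall(r'\b\d+\b', datasource_string)

print(all_runs)
print(standalone)
```

The plain \d+ still picks up the reference numbers, so some boundary condition is needed either way.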

3 solutions

Solution 1

#STRING DATA MINING
def main():
    s = '''RTGS-VEERPAL/KARBH13267941306
TO STAMP PAPER CHARGES
2013
1348
By Clg/33790/ICICI/DELCTS1/166
RTGS-KUSUM TOMAR/CNRBH13271696922
RTGS-SHRI BALAJI BUILDHOME PVT /KARBH13294076078
By Clg/956226/AXIS/DELCTS1/171'''
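    # Split the text into lines, then split every line on '/'
    L = s.split('\n')
    N = []
    for e in L:
        N.append(e.split('/'))
    print(s, '')
    #print('--- WANTED NUMBERS ---')
    # The positions below are hard-coded for this exact sample:
    # N[line index][field index]
    print(int(N[2][0]))  # 2013
    print(int(N[3][0]))  # 1348
    print(int(N[4][1]))  # 33790
    print(int(N[7][1]))  # 956226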
    
if __name__ == '__main__':
    main()
    print('*** STRING DATA MINING OVER ***')

Posted 1-Jan-23 8:22am

giocip

Updated 1-Jan-23 8:24am

v2

Solution 3

Python

#STRING DATA MINING BY REGEX
import re

def main():
    s = '''RTGS-VEERPAL/KARBH13267941306
TO STAMP PAPER CHARGES
2013
1348
By Clg/33790/ICICI/DELCTS1/166
RTGS-KUSUM TOMAR/CNRBH13271696922
RTGS-SHRI BALAJI BUILDHOME PVT /KARBH13294076078
By Clg/956226/AXIS/DELCTS1/171'''
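    L = s.split('\n')
    M = []
    # Raw string so that \d is not treated as a string escape
    regex = r'\d+'
    for n in L:
        # Only the first run of digits on each line is taken
        match = re.search(regex, n)
        M.append(match.group() if match else None)
    # Skip lines without digits and drop the long reference numbers
    WANTED = [int(w) for w in M if w and int(w) <= 1_000_000_000]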
    print('--- WANTED NUMBERS ---')
    for w in WANTED:
        print(w)
    
if __name__ == '__main__':
    main()
    print('*** STRING DATA MINING OVER ***')

Posted 11-Jan-23 20:23pm

giocip

Comments

CHill60 12-Jan-23 5:48am

You have posted two different solutions - both without any commentary - which one is meant to be the solution? Don't post multiple answers to a question - if you need to update a solution then use the "Improve Solution" link on your first one.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Zak River · Accepted Answer · 2023-01-11T12:35:00

Try running this example
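
The example itself is not shown here, so what follows is only a sketch of one possible approach, not necessarily the one this answer had in mind. It assumes the wanted values are runs of at least four digits that do not touch a letter or another digit, and the names PATTERN and wanted_numbers are made up for illustration:

```python
import re

SAMPLE = '''RTGS-VEERPAL/KARBH13267941306
TO STAMP PAPER CHARGES
2013
1348
By Clg/33790/ICICI/DELCTS1/166
RTGS-KUSUM TOMAR/CNRBH13271696922
RTGS-SHRI BALAJI BUILDHOME PVT /KARBH13294076078
By Clg/956226/AXIS/DELCTS1/171'''

# Four or more digits, with no letter or digit directly before or after.
# Digits are part of both lookarounds so that a long run cannot be
# matched partially from its middle.
PATTERN = re.compile(r'(?<![A-Za-z0-9])\d{4,}(?![A-Za-z0-9])')

def wanted_numbers(text):
    """Return the matching numbers as ints, in order of appearance."""
    return [int(m) for m in PATTERN.findall(text)]

for number in wanted_numbers(SAMPLE):
    print(number)
```

The lookarounds keep the {4,} idea from the question but stop it from matching inside KARBH13267941306 and similar references.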