I am altering a database of approximately 500 HTML pages using phpMyAdmin. Several pages contain a Facebook Pixel or Google Tag that I would like to remove.

The easiest way, I thought, would be to use a regex to find each entire tag that contains some expression or term related to Facebook or Google, and replace it with an empty string.

An example would be


HTML
<script>
    window.dataLayer = window.dataLayer || [];

    function gtag() {
      dataLayer.push(arguments);
    }
    gtag('js', new Date());
    gtag('config', 'G-XXXXXXXX');
  </script>



or

HTML
<script>
(window, document, 'script', 'https://connect.facebook.net/en_US/fbevents.js');
    fbq('init', '9999999999999999');
    fbq('track', 'salespage_xxxxxx');
  </script>


Although the snippets are all unique, each one contains some common code or another element that makes it possible to identify it.

Before running it in phpMyAdmin, I'm trying to formulate the expression using Sublime Text 3.

This is my first contact with regex and I find it fascinating, but even after following some references I can't get the search to match.

The expression I came up with after some research was:

<(.*)>[\s\S]face[\s\S]<\/(.*)>

I thought this expression would select the entire tag containing the word "face", but it doesn't find anything.

I would appreciate some help.

If this works, I will be able to make several other necessary changes the same way.

What I have tried:

(.*)>[\s\S]face[\s\S]<\/(.*)
Updated 25-Nov-21 20:19pm

1 solution

To be honest, regex is not a good tool for this: it's a text processor, which means it's great at processing well-formed text, but not so great at processing syntax.
If you are going to play with regexes, then get a copy of Expresso - it's free, and it examines, tests, and generates regular expressions.

And the regex you show doesn't even come close to finding anything in your examples. It breaks down as:
A numbered capture group
   Any char, any number of repetitions (including zero)
A literal ">" followed by exactly one character (whitespace or non-whitespace)
The literal "face"
Exactly one more character (whitespace or non-whitespace)
A literal "<" followed by a literal "/"
A numbered capture group
   Any char, any number of repetitions (including zero)
So "face" would have to sit exactly one character after a ">" and exactly one character before a "</", which never happens in your examples.

Which frankly is garbage, and even if it did match - which it won't - the greedy ".*" groups would grab far more than the single tag you are after!
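
To make the breakdown above concrete, here is a minimal sketch in Python (chosen only for illustration; Sublime Text and phpMyAdmin use their own regex flavours, which may differ in details). `SAMPLE`, `script_with` and the "facebook" marker are hypothetical names, not anything from the original post. It shows the posted pattern finding nothing, then a narrower pattern that stays inside one script element:

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"<(.*)>[\s\S]face[\s\S]<\/(.*)>")


def script_with(marker: str) -> re.Pattern:
    """Hypothetical helper: match one <script> element containing `marker`."""
    # Any run of characters that does not step over a closing </script>.
    body = r"(?:(?!</script>).)*?"
    return re.compile(
        r"<script\b[^>]*>" + body + re.escape(marker) + body + r"</script>",
        re.IGNORECASE | re.DOTALL,
    )


# Made-up page fragment: one script to keep, one tracking script to remove.
SAMPLE = """<p>before</p>
<script>var keep = 1;</script>
<script>
  (window, document, 'script', 'https://connect.facebook.net/en_US/fbevents.js');
  fbq('init', '9999999999999999');
</script>
<p>after</p>"""

# "face" is never exactly one character after ">", so this prints None.
print(ORIGINAL.search(SAMPLE))

# Remove only the script element that mentions the marker.
CLEANED = script_with("facebook").sub("", SAMPLE)
print(CLEANED)
```

The `(?!</script>)` lookahead is what stops a match from running on into the next element. It would still go wrong if a script contained the text `</script>` inside a string literal, which is the kind of case where a parser is the safer choice.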

Instead of that, look at an HTML parser: a search for "php html processor" will turn up several.
I haven't used any of them - I do all my HTML stuff in C# - but something there will work better than any regex!
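
As one way to picture the parser route, the sketch below uses Python's standard-library `html.parser` rather than a PHP library (none is named above, so none is assumed here). `ScriptFinder`, `strip_scripts`, `PAGE` and the `MARKERS` tuple are hypothetical names for illustration only. The parser locates each script element, and an element is cut from the source only if it contains one of the marker strings:

```python
from html.parser import HTMLParser


class ScriptFinder(HTMLParser):
    """Records (start, end) character offsets of every <script> element."""

    def __init__(self, source: str):
        super().__init__()
        self.source = source
        # Offset at which each line begins, to convert getpos() results.
        self.line_starts = [0] + [i + 1 for i, ch in enumerate(source) if ch == "\n"]
        self.spans = []
        self._open_at = None
        self.feed(source)
        self.close()

    def _offset(self) -> int:
        # getpos() gives (line, column) of the tag currently being handled.
        line, column = self.getpos()
        return self.line_starts[line - 1] + column

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._open_at = self._offset()

    def handle_endtag(self, tag):
        if tag == "script" and self._open_at is not None:
            # The span ends just after the ">" that closes </script>.
            end = self.source.index(">", self._offset()) + 1
            self.spans.append((self._open_at, end))
            self._open_at = None


def strip_scripts(source: str, markers) -> str:
    """Drop each <script> element whose text contains any marker string."""
    pieces, cursor = [], 0
    for start, end in ScriptFinder(source).spans:
        if any(marker in source[start:end] for marker in markers):
            pieces.append(source[cursor:start])
            cursor = end
    pieces.append(source[cursor:])
    return "".join(pieces)


# Made-up page built from the two snippets in the question.
PAGE = """<h1>Sales page</h1>
<script>
    window.dataLayer = window.dataLayer || [];
    function gtag() { dataLayer.push(arguments); }
    gtag('js', new Date());
    gtag('config', 'G-XXXXXXXX');
</script>
<script>var keep = true;</script>
<script>
    fbq('init', '9999999999999999');
    fbq('track', 'salespage_xxxxxx');
</script>
<p>Content</p>"""

MARKERS = ("gtag(", "fbq(")  # assumed identifying strings
CLEANED = strip_scripts(PAGE, MARKERS)
print(CLEANED)
```

Because the parser decides where each element starts and ends, the marker test is a plain substring check and no pattern has to describe HTML structure. The same idea should carry over to whichever PHP parser is chosen.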
 
Comments
Maciej Los 26-Nov-21 13:23pm    
5ed!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


