How to Produce Quality Titles & Meta Descriptions Automatically

Posted: Apr 18, 2020
Categories: SEO

In these difficult times, it is more important than ever to get more effective work done in less time and with fewer resources.

One boring and time-consuming SEO task that is often neglected is writing compelling titles and meta descriptions at scale.

This is particularly true when the site has thousands or millions of pages.

It is hard to put in the effort when you don’t know whether the reward will be worth it.

In this column, you will learn how to use the most recent advances in Natural Language Understanding and Generation to automatically produce quality titles and meta descriptions.

We will access this exciting generation capability conveniently from Google Sheets. We will learn to implement the functionality with minimal Python and JavaScript code.


Here is our technical plan:

  • We will implement and evaluate a couple of recent state-of-the-art text summarization models in Google Colab
  • We will serve one of the models from a Google Cloud Function that we can easily call from Apps Script and Google Sheets
  • We will scrape page content directly from Google Sheets and summarize it with our custom function
  • We will deploy our generated titles and meta descriptions as experiments in Cloudflare using RankSense
  • We will create another Google Cloud Function to trigger automated indexing in Bing

Introducing Hugging Face Transformers

Hugging Face Transformers is a popular library among AI researchers and practitioners.

It provides a unified, simple-to-use interface to the latest natural language research.

It doesn’t matter whether the research was coded in TensorFlow (Google’s deep learning framework) or PyTorch (Facebook’s framework); they are the two most widely adopted.

While the transformers library offers simpler code, it isn’t as simple for end users as Ludwig (I’ve covered Ludwig in previous deep learning articles).

That changed recently with the introduction of the transformers pipelines.

Pipelines encapsulate many common natural language processing use cases using minimal code.

They also provide a lot of flexibility over the underlying model usage.
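
To get a sense of how little code a pipeline needs, here is a quick sketch using one of the library’s built-in tasks (the example sentence is my own):

from transformers import pipeline

# one line gives us a working sentiment classifier
classifier = pipeline("sentiment-analysis")

# returns a label and a confidence score for the input text
print(classifier("Transformers pipelines are delightfully simple."))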

We will evaluate several state-of-the-art text summarization options using transformers pipelines.

We will borrow some code from the examples in this notebook.

Facebook’s BART

When announcing BART, Facebook Researcher Mike Lewis shared some really impressive abstractive summaries from their paper.

Now, let’s see how easy it is to reproduce the results of their work using a transformers pipeline.

First, let’s install the library in a new Google Colab notebook.

Make sure to select the GPU runtime.

!pip install transformers

Next, let’s add the pipeline code.

from transformers import pipeline

# use BART in PyTorch
bart_summarizer = pipeline("summarization")

This is the example text we will summarize.

TEXT_TO_SUMMARIZE = """ 
New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York. 
A year later, she got married again in Westchester County, but to a different man and without divorcing her first husband. 
Only 18 days after that marriage, she got hitched yet again. Then, Barrientos declared "I do" five more times, sometimes only within two weeks of each other. 
In 2010, she married once more, this time in the Bronx. In an application for a marriage license, she stated it was her "first and only" marriage. 
Barrientos, now 39, is facing two criminal counts of "offering a false instrument for filing in the first degree," referring to her false statements on the 
2010 marriage license application, according to court documents. 
Prosecutors said the marriages were part of an immigration scam. 
On Friday, she pleaded not guilty at State Supreme Court in the Bronx, according to her attorney, Christopher Wright, who declined to comment further. 
After leaving court, Barrientos was arrested and charged with theft of service and criminal trespass for allegedly sneaking into the New York subway through an emergency exit, said Detective 
Annette Markowski, a police spokeswoman. In total, Barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002. 
All occurred either in Westchester County, Long Island, New Jersey or the Bronx. She is believed to still be married to four men, and at one time, she was married to eight men at once, prosecutors say. 
Prosecutors said the immigration scam involved some of her husbands, who filed for permanent residence status shortly after the marriages. 
Any divorces happened only after such filings were approved. It was unclear whether any of the men will be prosecuted. 
The case was referred to the Bronx District Attorney\'s Office by Immigration and Customs Enforcement and the Department of Homeland Security\'s 
Investigation Division. Seven of the men are from so-called "red-flagged" countries, including Egypt, Turkey, Georgia, Pakistan and Mali. 
Her eighth husband, Rashid Rajput, was deported in 2006 to his native Pakistan after an investigation by the Joint Terrorism Task Force. 
If convicted, Barrientos faces up to four years in prison. Her next court appearance is scheduled for May 18.
"""

Here is the summary code and resulting summary:

summary = bart_summarizer(TEXT_TO_SUMMARIZE, min_length=50, max_length=250)
print(summary) #Output: [{'summary_text': 'Liana Barrientos has been married 10 times, sometimes within two weeks of each other. Prosecutors say the marriages were part of an immigration scam. She is believed to still be married to four men, and at one time, she was married to eight at once.'}]

I specified that the generated summary should be no shorter than 50 and no longer than 250 (the pipeline measures these limits in tokens).

This is very useful to control the type of generation: titles or meta descriptions.
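
For instance, here is a minimal sketch of how the two output types could be targeted with different limits; the exact values are illustrative assumptions, not tuned recommendations:

# illustrative length settings (in tokens): short outputs for titles,
# longer ones for meta descriptions
title = bart_summarizer(TEXT_TO_SUMMARIZE, min_length=5, max_length=15)
meta_description = bart_summarizer(TEXT_TO_SUMMARIZE, min_length=30, max_length=70)

print(title[0]["summary_text"])
print(meta_description[0]["summary_text"])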

Now, look at the quality of the generated summary. We only typed a few lines of Python code.

Super cool!

'Liana Barrientos has been married 10 times, sometimes within two weeks of each other. Prosecutors say the marriages were part of an immigration scam. She is believed to still be married to four men, and at one time, she was married to eight at once.'

print(len(summary[0]["summary_text"]))
#Output: 249

Google’s T5

Another state of the art model is Text-to-Text Transfer Transformer, or T5.

One impressive accomplishment of the model is that its performance got really close to the human baseline on the SuperGLUE leaderboard.

This is notable because the NLP tasks in SuperGLUE are designed to be easy for humans but hard for machines.

Google recently published a summary writeup with all the details of the model for people less inclined to learn about it from the research paper.

Their core idea was to try every popular NLP idea on a new massive training dataset they call C4 (Colossal Clean Crawled Corpus).

I know, AI researchers like to have fun naming their inventions.

Let’s use another transformers pipeline to summarize the same text, but this time with T5 as the underlying model.

t5_summarizer = pipeline("summarization", model="t5-base", tokenizer="t5-base")

summary = t5_summarizer(TEXT_TO_SUMMARIZE, min_length=50, max_length=250)

Here is the summarized text.

[{'summary_text': 'in total, barrientos has been married 10 times, with nine of her marriages occurring between 1999 and 2002 . she is believed to still be married to four men, and at one time, she was married to eight men at once .'}]

This summary is also very high quality.

But I decided to try the larger T5 model, which is also available as a pipeline, to see if the quality could improve.

t5_summarizer_larger = pipeline("summarization", model="t5-large", tokenizer="t5-large")

I was not disappointed at all.

Really impressive summary!

[{'summary_text': 'Liana barrientos has been married 10 times . nine of her marriages occurred between 1999 and 2002 . she is believed to still be married to four men, and at one time, she was married to eight men at once .'}]

Introducing Cloud Functions

Now that we have code that can effectively summarize page content, we need a simple way to expose it as an API.

In my previous article, I used Ludwig serve to do this, but as we are not using Ludwig here, we are going to take a different approach: Cloud Functions.

Cloud Functions and equivalent “serverless” technologies are arguably the simplest way to get server-side code for production use.

They are called serverless because you don’t need to provision web servers or virtual machines with hosting providers.

They dramatically simplify the deployment experience as we will see.

Deploying a Hello World Cloud Function

We don’t need to leave Google Colab to deploy our first test Cloud Function.

First, log in to your Google Cloud account.

!gcloud auth login --no-launch-browser

Then, set up a default project.

!gcloud config set project project-name

Next, we will write our test function in a file named main.py.

%%writefile main.py
def hello_get(request):
    """HTTP Cloud Function.

    Args:
        request (flask.Request): The request object.

    Returns:
        The response text, or any set of values that can be turned into a
        Response object using `make_response`.
    """
    return 'Hello World!'

We can deploy this function using this command.

!gcloud functions deploy hello_get --runtime python37 --trigger-http --allow-unauthenticated

After a few minutes, we get the details of our new API service.

availableMemoryMb: 256
entryPoint: hello_get
httpsTrigger:
  url: https://xxx.cloudfunctions.net/hello_get
ingressSettings: ALLOW_ALL
labels:
  deployment-tool: cli-gcloud
name: projects/xxx/locations/us-central1/functions/hello_get
runtime: python37
serviceAccountEmail: xxxx
sourceUploadUrl: xxxx
status: ACTIVE
timeout: 60s
updateTime: '2020-04-06T16:33:03.951Z'
versionId: '8'

That’s it!

We didn’t need to set up virtual machines, web server software, etc.

We can test it by opening the provided URL; we should get the text “Hello World!” as the response in the browser.
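
Alternatively, we can check it from the notebook itself; here is a quick sketch using requests, with the same xxx placeholder as in the deploy output above:

import requests

# the placeholder URL follows the httpsTrigger value from the deploy output
print(requests.get("https://xxx.cloudfunctions.net/hello_get").text)
# expected output: Hello World!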

Deploying our Text Summarization Cloud Function

In theory, we should be able to wrap our text summarization pipeline into a function and follow the same steps to deploy an API service.

However, I had to overcome several challenges to get this to work.

Our first and most challenging problem was installing the transformers library to begin with.

Fortunately, it is simple to install third-party packages to use in Python-based Cloud Functions.

You simply need to create a standard requirements.txt file like this:

%%writefile requirements.txt

transformers==2.0.7

Unfortunately, this fails because transformers requires either PyTorch or TensorFlow. Both are installed by default in Google Colab, but they need to be specified for the Cloud Functions environment.


By default, transformers uses PyTorch, and when I added it as a requirement, it threw an error that led me to this useful Stack Overflow thread.

I got it to work with this updated requirements.txt file.

%%writefile requirements.txt

https://download.pytorch.org/whl/cpu/torch-1.0.1.post2-cp37-cp37m-linux_x86_64.whl
transformers==2.0.7

The next challenge was the huge memory requirements of the models and the limitations of Cloud Functions.

I first tested functions using simpler pipelines, like the NER one (NER stands for Named Entity Recognition).

I tested it first in the Colab notebook.

from transformers import pipeline

nlp_token_class = None

def ner_get(request):

  global nlp_token_class
  #run once
  if nlp_token_class is None:
    nlp_token_class = pipeline('ner')

  result = nlp_token_class('Hugging Face is a French company based in New-York.')

  return result

I got this token-by-token breakdown as the response.

[{'entity': 'I-ORG', 'score': 0.9970937967300415, 'word': 'Hu'},
{'entity': 'I-ORG', 'score': 0.9345749020576477, 'word': '##gging'},
{'entity': 'I-ORG', 'score': 0.9787060022354126, 'word': 'Face'},
{'entity': 'I-MISC', 'score': 0.9981995820999146, 'word': 'French'},
{'entity': 'I-LOC', 'score': 0.9983047246932983, 'word': 'New'},
{'entity': 'I-LOC', 'score': 0.8913459181785583, 'word': '-'},
{'entity': 'I-LOC', 'score': 0.9979523420333862, 'word': 'York'}]

Then, I can simply add a %%writefile main.py to create a Python file I can use to deploy the function.

When I reviewed the logs to learn why the initial API calls failed, I saw that the memory requirement was a big issue.

But, fortunately, you can easily override the default 256MB memory limit and execution timeout using this command.

!gcloud functions deploy ner_get --memory 2GiB --timeout 540 --runtime python37 --trigger-http --allow-unauthenticated

I basically specify a maximum memory of 2GB and a 9-minute execution timeout to allow for the initial download of the model, which can involve gigabytes of transfer.

One trick I’m using to speed up subsequent calls to the same Cloud Function is to cache the downloaded model in memory using a global variable, checking whether it exists before recreating the pipeline.

After testing the BART and T5 functions, I settled on a T5-small model, which fits well within the memory and timeout requirements of Cloud Functions.

Here is the code for that function.

%%writefile main.py

from transformers import pipeline

nlp_t5 = None

def t5_get(request):

  global nlp_t5

  #run once
  if nlp_t5 is None:
    nlp_t5 = pipeline('summarization', model="t5-small", tokenizer="t5-small")

  TEXT_TO_SUMMARIZE = """ 
  New York (CNN)When Liana Barrientos was 23 years old, she got married in Westchester County, New York... 
  """


  result = nlp_t5(TEXT_TO_SUMMARIZE)

  return result[0]["summary_text"]

And here is the code to deploy it.

!gcloud functions deploy t5_get --memory 2GiB --timeout 540 --runtime python37 --trigger-http --allow-unauthenticated

One problem with this function is that the text to summarize is hard coded.

But we can easily fix that with the following changes.

%%writefile main.py

from transformers import pipeline

nlp_t5 = None

def t5_post(request):

  global nlp_t5

  # run once
  if nlp_t5 is None:
    # small model to avoid memory issues
    nlp_t5 = pipeline('summarization', model="t5-small", tokenizer="t5-small")

  # get the text to summarize from the POST request
  content_type = request.headers['content-type']

  if content_type == 'application/x-www-form-urlencoded':
    text = request.form.get('text')
    result = nlp_t5(text)
    return result[0]["summary_text"]
  else:
    raise ValueError("Unknown content type: {}".format(content_type))

I simply make sure the content type is form URL-encoded and read the text parameter from the form data.

I can easily test this function in Colab using the following code.

import requests
url = "https://xxx.cloudfunctions.net/t5_post"
data = {"text": text[:100]}  # `text` holds the content to summarize

requests.post(url, data).text

As everything works as expected, I can move on to getting this working in Google Sheets.

Calling our Text Summarization Service from Google Sheets

I introduced Apps Script in my previous column; it really gives Google Sheets superpowers.

I was able to make minor changes to the function I created to make it usable for text summarization.

function getSummary(text){

  var payload = `text=${text}`;
  payload = encodeURI(payload);
  console.log(payload);

  var url = "https://xxx.cloudfunctions.net/t5_post";

  var options = {
    "method": "POST",
    "contentType": "application/x-www-form-urlencoded",
    "payload": payload,
    "muteHttpExceptions": true
  };

  var response = UrlFetchApp.fetch(url, options);

  var result = response.getContentText();
  console.log(result);

  return result;
}

That’s all.

I changed the input variable name and the API URL.

The rest is the same, as I need to submit a POST request with form data.
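
In the sheet itself, the custom function can then be called like any built-in formula; the cell reference here is just for illustration:

=getSummary(A2)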

We run tests, and check the console logs to make sure that everything works as expected.

It does.

One big limitation in Apps Script is that custom functions can’t run for more than 30 seconds.

In practice, this means I can only summarize text if it is under about 1,200 characters, while in Colab/Python I tested full articles with more than 10,000 characters.

An alternative approach that should work better for longer text is to update the Google Sheet from Python code as I’ve done in this article.
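
For example, here is a minimal sketch using the gspread library; the key file, sheet name, and column layout are assumptions for illustration, not necessarily the approach from that article:

# a minimal sketch, assuming gspread with a service-account key file
# (file and sheet names are hypothetical)
from transformers import pipeline
import gspread

nlp_t5 = pipeline('summarization', model="t5-small", tokenizer="t5-small")

gc = gspread.service_account(filename="service_account.json")
ws = gc.open("Title Experiments").sheet1

for i, row in enumerate(ws.get_all_values()[1:], start=2):  # skip the header row
    text = row[1]  # assumes column B holds the page text
    summary = nlp_t5(text)[0]["summary_text"]
    ws.update_cell(i, 3, summary)  # write the summary to column C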

Scraping Page Content from Google Sheets

Here are some examples of the full working code in Sheets.


Now that we are able to summarize text content, let’s see how we can extract it directly from public webpages.

Google Sheets includes a powerful function for this called IMPORTXML.

We just need to provide the URL and an XPath selector that identifies the content we want to extract.

Here is code to extract content from a Wikipedia page.

=IMPORTXML("https://en.wikipedia.org/wiki/Moon_landing", "//div/text()")

I used a generic selector to capture all text inside DIVs.

Feel free to play with different selectors that fit your target page content.

While we are able to get the page content, the result comes back spread over multiple rows. We can fix that with another function, TEXTJOIN.

=TEXTJOIN(" ", TRUE, IMPORTXML("https://en.wikipedia.org/wiki/Moon_landing", "//div/text()"))
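
Putting the pieces together, the scraped text can be fed straight into our custom function; LEFT() keeps the input under the 1,200-character Apps Script limit noted earlier (the combination is illustrative):

=getSummary(LEFT(TEXTJOIN(" ", TRUE, IMPORTXML("https://en.wikipedia.org/wiki/Moon_landing", "//div/text()")), 1200))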

Fast Indexing Our New Meta Descriptions and Titles in Bing

So, how do we know whether these new search snippets perform better than manually written ones, or better than no metadata at all?

A surefire way to learn is by running a live test.

In such a case, it is critically important to get our changes indexed fast.

I’ve covered how to do this in Google by automating the URL Inspection tool.

That approach limits us to a few hundred pages.

A better alternative is to use the awesome Bing fast indexing API because we get to request indexing for up to 10,000 URLs!

How cool is that?

As we would be primarily interested in measuring CTR, our assumption is that if we get higher CTR in Bing, the same will likely happen in Google and other search engines.

Here is the code to do that.

api_key = "xxx" # Get your own API key from this URL https://www.bing.com/webmaster/home/api

import requests

def submit_to_bing(request):

  global api_key

  api_url=f"https://ssl.bing.com/webmaster/api.svc/json/SubmitUrlbatch?apikey={api_key}"

  print(api_url)

  #Replace for your site 

  url_list = [
      "https://www.domain.com/page1.html",  "https://www.domain.com/page2.html"]

  data = {

    "siteUrl": "http://www.domain.com",

    "urlList": url_list

    }

  r = requests.post(api_url, json=data)

  if r.status_code == 200:

    return r.json()

  else:

    return r.status_code

If the submission is successful, you should expect this response.

{'d': None}

We can then create another Cloud Function by adding it to our main.py file and using the deploy command as before.
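
The deploy step follows the same pattern as the earlier functions; a sketch, assuming submit_to_bing lives in the same main.py:

!gcloud functions deploy submit_to_bing --runtime python37 --trigger-http --allow-unauthenticated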

Testing Our Generated Snippets in Cloudflare

Finally, if your site uses the Cloudflare CDN, you could use the RankSense app to run these changes as experiments before you roll them out across the site.

Simply copy the URL and text summary columns to a new Google Sheet and import it into the tool.


When you publish the experiment, you can opt to schedule the change and specify a webhook URL.

A webhook URL allows apps and services to talk to each other in automated workflows.


Copy and paste the URL of the Bing indexing Cloud Function. RankSense will call it automatically 15 minutes after the changes are propagated in Cloudflare.
