The code
Using a combination of the PHP functions strpos() and substr(), we can extract the first sentence from the above text by finding the position of the first period / full stop in the content and returning everything up to and including it.
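function first_sentence($content) {
    // Position of the first full stop in the content
    $pos = strpos($content, '.');
    // Everything up to and including that full stop
    return substr($content, 0, $pos + 1);
}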
Then doing this:
echo first_sentence($content);
would output this:
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
What if there are no periods / full stops?
The first example assumes that there will be at least one period / full stop in the content. If there isn’t, strpos() returns false, which is treated as 0 when calculating $pos+1, so the example code will simply return the first character of the passed-in string.
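To see why this happens, here is a minimal sketch of the two calls from the first example run against a string with no full stop. The sample text in $content is made up for illustration.

```php
<?php
// Hypothetical sample content that contains no full stop
$content = 'Lorem ipsum dolor sit amet';

// strpos() returns false (not a number) when the needle is not found
$pos = strpos($content, '.');
var_dump($pos); // bool(false)

// false is treated as 0 in arithmetic, so $pos + 1 evaluates to 1
// and substr() returns just the first character
$result = substr($content, 0, $pos + 1);
echo $result; // L
```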
This isn’t ideal, so we can modify the first_sentence() function to check whether strpos() actually found a full stop, and if it didn’t, just return the whole string instead:
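function first_sentence($content) {
    $pos = strpos($content, '.');
    // strpos() returns false when there is no full stop;
    // use === so that a match at position 0 is not mistaken for "not found"
    if ($pos === false) {
        return $content;
    }
    else {
        return substr($content, 0, $pos + 1);
    }
}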
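As a quick check of the fallback, the sketch below repeats the function above and calls it with and without a full stop. Both sample strings are made up for illustration.

```php
<?php
// Same function as above, repeated so this snippet runs on its own
function first_sentence($content) {
    $pos = strpos($content, '.');
    if ($pos === false) {
        return $content;
    }
    else {
        return substr($content, 0, $pos + 1);
    }
}

// No full stop: the whole string comes back unchanged
$without = first_sentence('Lorem ipsum dolor sit amet');
echo $without . "\n"; // Lorem ipsum dolor sit amet

// With a full stop: only the first sentence comes back
$with = first_sentence('Lorem ipsum. Dolor sit amet.');
echo $with . "\n"; // Lorem ipsum.
```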
Automatically removing HTML code
And finally, we’ll modify the code to remove any HTML tags and entities. If the source content is always plain text then you won’t need to do this step, but if it can contain HTML then you’ll need to clean it up first.
You may not need to use the html_entity_decode() part (which converts e.g. &amp; to &) but you will need to strip the tags, otherwise with <p>blah blah blah.</p> you’d end up with <p>blah blah blah. without the closing </p> tag. It’s also possible that your HTML tags contain . characters (for example in an attribute value), which would falsely indicate the end of the sentence.
function first_sentence($content) {
    // Strip the HTML tags first, then convert entities back to characters
    $content = html_entity_decode(strip_tags($content));
    $pos = strpos($content, '.');
    if ($pos === false) {
        return $content;
    }
    else {
        return substr($content, 0, $pos + 1);
    }
}
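To illustrate both points, the sketch below repeats the final function and runs it on two made-up HTML snippets: one with an entity, and one where a tag attribute contains a . character.

```php
<?php
// Same function as above, repeated so this snippet runs on its own
function first_sentence($content) {
    $content = html_entity_decode(strip_tags($content));
    $pos = strpos($content, '.');
    if ($pos === false) {
        return $content;
    }
    else {
        return substr($content, 0, $pos + 1);
    }
}

// Hypothetical input with an entity: tags are stripped and &amp; becomes &
$entity = first_sentence('<p>Tom &amp; Jerry went home. They slept.</p>');
echo $entity . "\n"; // Tom & Jerry went home.

// Hypothetical input with a "." inside an attribute: the tag is removed
// before searching, so the "." in about.html is never seen
$link = first_sentence('<p>Visit <a href="about.html">our page</a> today. Thanks.</p>');
echo $link . "\n"; // Visit our page today.
```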
Conclusion
It’s easy to extract the first sentence from some content using the PHP functions strpos() and substr() by looking for the first occurrence of a period / full stop. The final example function in this post combines this with a fallback in case the content does not contain a full stop, and strips HTML tags and decodes entities before searching.