How to validate an email address in PHP

I have this function to validate email addresses:

function validateEMAIL($EMAIL) {
    $v = "/[a-zA-Z0-9_-.+]+@[a-zA-Z0-9-]+.[a-zA-Z]+/";

    return (bool)preg_match($v, $EMAIL);
}

Is this okay for checking if the email address is valid or not?

If it works, it works. You can't really make it better, it's too small. The only thing that's not good is the style: validateEmail would be correct, as well as passing $email, not $EMAIL.
Just wanted to make sure I didn't have any major problems in the code that's all :)
See also stackoverflow.com/questions/201323/… for more about how and how not to use regular expressions to validate email addresses.
That would fail to validate many valid email addresses. For example *@example.com or '@example.com or me@[127.0.0.1] or you@[ipv6:08B0:1123:AAAA::1234]
@jcoder, not that I'm recommending that regex, but at least we can hope anyone using such addresses for sign-up etc. wouldn't complain when it fails :)

emkey08

The easiest and safest way to check whether an email address is well-formed is to use the filter_var() function:

if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
    // invalid email address
}

Additionally you can check whether the domain defines an MX record:

if (!checkdnsrr($domain, 'MX')) {
    // domain is not valid
}

But this still doesn't guarantee that the mailbox exists. The only way to find that out is by sending a confirmation mail.
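The two checks above can be combined roughly as follows. This is only a sketch: emailDomain() is a hypothetical helper name, not part of PHP, and it assumes the domain is whatever follows the last "@" of an address that already passed filter_var().

```php
<?php
// Hypothetical helper: returns the domain part with a trailing dot,
// or null when the address is not well-formed.
function emailDomain(string $email): ?string
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return null;
    }

    // Split at the LAST "@", because a quoted local part may contain "@".
    $domain = substr($email, strrpos($email, '@') + 1);

    // The trailing dot makes the name fully qualified for the DNS lookup.
    return $domain . '.';
}
```

The result can then be passed to checkdnsrr($domain, 'MX') when it is not null. That call needs network access, so it is left out of the sketch.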

Now that you have your easy answer feel free to read on about email address validation if you care to learn or otherwise just use the fast answer and move on. No hard feelings.

Trying to validate an email address using a regex is an "impossible" task. I would go as far as to say that the regex you have made is useless. There are several RFCs regarding email addresses, and writing a regex that catches wrong email addresses without producing false positives at the same time is something no mortal can do. Check out this list for tests (both failed and succeeded) of the regex used by PHP's filter_var() function.

Even the built-in PHP functions, email clients or servers don't get it right. Still in most cases filter_var is the best option.

If you want to know which regex pattern PHP (currently) uses to validate email addresses see the PHP source.

If you want to learn more about email addresses, I suggest you start by reading the specs, but I have to warn you that they are not an easy read by any stretch:

RFC 5322

RFC 5321

RFC 3696

RFC 6531 (allows Unicode characters, although many clients / servers don't accept it)


It does not work for all email addresses, as stated. Also see the list of failed tests in my answer to see that some quoted strings do work and others do not.
Nope, too many failed tests on that pattern emailtester.pieterhordijk.com/test-pattern/MTAz :-)
This pattern is extremely complex if you need to use it with a function like preg_match_all() over a big text string with emails inside. If any of you has a simpler one, please share. I mean, if you run preg_match_all($pattern, $text_string, $matches); on a really big text, this complex pattern will overload the server.
What is your point @sergio? Again as stated @sergio there are several "failures".
@PeeHaa: Postfix 3.0 supports it for almost two years now: postfix.org/SMTPUTF8_README.html , and it is included in Ubuntu 16.04 and will be included in the next Debian release, for example. Exim has experimental support. Webmail providers like Gmail have also added support for sending/receiving such emails, although you cannot yet create unicode accounts. Widespread use and support is within reach, and filter_var will lag behind by quite some time, even if they change it right now (I have posted a bug report).
Alex

You can use filter_var for this.

<?php
   function validateEmail($email) {
      return filter_var($email, FILTER_VALIDATE_EMAIL);
   }
?>

Stop adding this function, as it does not validate domains. If you pass some@address, it is reported as valid, and it's not!
What's with all the one line functions containing one line functions? I am seeing them everywhere. When did this become a "thing"? (rhetorical). This needs to stop.
@user2607743 I think it makes sense, if you, one year later with 100 usages of it in your project and you want to improve the way you validate the emails.... then it's going to be faster to edit 1 function than a hundred places.
@HerrNentu' what's wrong with some@address? It is a perfectly valid email address, just like root@localhost is. You are just doing the wrong thing. You are syntactically validating the form of the email address, and some@address is valid according to the RFC. But what you want to do is validate that an address is reachable. some@address is only reachable if the host address is known in your network. To validate reachability, you can check the DNS (check that the host exists) or use SMTP (check that the mailbox exists).
@ChristopherK. the problem is that it validates the email address without a domain.
Jabari

In my experience, regex solutions have too many false positives and filter_var() solutions have false negatives (especially with all of the newer TLDs).

Instead, it's better to make sure the address has all of the required parts of an email address (user, "@" symbol, and domain), then verify that the domain itself exists.

There is no way to determine (server side) if an email user exists for an external domain.

This is a method I created in a Utility class:

public static function validateEmail($email)
{
    // SET INITIAL RETURN VARIABLES

        $emailIsValid = FALSE;

    // MAKE SURE AN EMPTY STRING WASN'T PASSED

        if (!empty($email))
        {
            // GET EMAIL PARTS

                $domain = ltrim(stristr($email, '@'), '@') . '.';
                $user   = stristr($email, '@', TRUE);

            // VALIDATE EMAIL ADDRESS

                if
                (
                    !empty($user) &&
                    !empty($domain) &&
                    checkdnsrr($domain)
                )
                {$emailIsValid = TRUE;}
        }

    // RETURN RESULT

        return $emailIsValid;
}

Neverbounce claims their API is able to validate to 97% delivery. As long as you don't mind handing over your contacts database, of course.
stristr will fail to get the domain if there are multiple @ signs. Better to explode('@',$email) and check that sizeof($array)==2
@AaronGillion While you are correct as far as a better way to get domain parts, the method would still return false as checkdnsrr() would return false if there were an @ sign in the domain.
Fluffeh

I think you might be better off using PHP's inbuilt filters, in this particular case filter_var():

It returns the validated address on success or false on failure when supplied with the FILTER_VALIDATE_EMAIL param.
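Since filter_var() hands back the filtered value rather than a plain boolean, a strict comparison against false is the safer way to turn it into a yes/no answer. A minimal sketch, where isWellFormedEmail() is just an illustrative name:

```php
<?php
// filter_var() returns the validated string on success and false on failure,
// so compare strictly against false instead of relying on truthiness.
function isWellFormedEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(filter_var('user@example.com', FILTER_VALIDATE_EMAIL)); // the address itself
var_dump(filter_var('not an email', FILTER_VALIDATE_EMAIL));     // false
```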


smottt

This will not only validate your email, but also sanitize it for unexpected characters:

$email  = $_POST['email'];
$emailB = filter_var($email, FILTER_SANITIZE_EMAIL);

if (filter_var($emailB, FILTER_VALIDATE_EMAIL) === false ||
    $emailB != $email
) {
    echo "This email address isn't valid!";
    exit(0);
}

It considers error`@gmail.com a valid email. Note that it contains `.
Pelmered

After reading the answers here, this is what I ended up with:

public static function isValidEmail(string $email) : bool
{
    if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
        return false;
    }

    //Get host name from email and check if it is valid
    $email_host = array_slice(explode("@", $email), -1)[0];

    // Check if valid IP (v4 or v6). If it is we can't do a DNS lookup
    if (!filter_var($email_host,FILTER_VALIDATE_IP, [
        'flags' => FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE,
    ])) {
        //Add a dot to the end of the host name to make a fully qualified domain name
        // and get last array element because an escaped @ is allowed in the local part (RFC 5322)
        // Then convert to ascii (http://us.php.net/manual/en/function.idn-to-ascii.php)
        $email_host = idn_to_ascii($email_host.'.');

        //Check for MX pointers in DNS (if there are no MX pointers the domain cannot receive emails)
        if (!checkdnsrr($email_host, "MX")) {
            return false;
        }
    }

    return true;
}

Is there any reason for the array_slice? Why don't you just use explode("@", $email)[1]? Can @ characters appear in the user part of the email address?
@User1337 I think it was for backwards compatibility. Accessing the return type directly like that is not supported before PHP 5.4 (I think). However, that is a pretty old and unsupported version by now so I would probably do as you suggest.
I just tested it, and you are actually right. From the perspective of someone who started coding a couple of years ago, it's unbelievable what programmers had to deal with to achieve the simplest things.
An MX entry is not necessary to receive emails. If none is present, the A entry will be used. See serverfault.com/questions/470649/…
@ChristopherK. Oh, that was interesting. I have used a check like this in various projects and have probably validated over a million email addresses, and this has never been a problem. I think it's a pretty good check to make to make sure the domain is actually pointed somewhere. Maybe a fallback check for an A pointer could be used, but that might do more harm than good even it seems like a more correct check.
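The fallback discussed in these comments (try MX first, then address records) might look something like the sketch below. domainCanReceiveMail() is a made-up name, and the $lookup parameter is an assumption added here so the logic can be exercised without real DNS queries. By default it calls checkdnsrr().

```php
<?php
// Hypothetical helper: true if the host has an MX record or, failing that,
// an A or AAAA record that mail servers can fall back to.
// $lookup is injectable for testing and defaults to PHP's checkdnsrr().
function domainCanReceiveMail(string $host, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    foreach (['MX', 'A', 'AAAA'] as $type) {
        if ($lookup($host, $type)) {
            return true;
        }
    }

    return false;
}
```

Whether the extra A/AAAA lookup is worth it is a judgment call, as the comment above says: it accepts more domains, including ones that merely host a website.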
Mostafa Norzade

Use the code below:

// Variable to check
$email = "john.doe@example.com";

// Remove all illegal characters from email
$email = filter_var($email, FILTER_SANITIZE_EMAIL);


// Validate e-mail
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
  echo("Email is a valid email address");
}

In most cases, you probably don't want to strip illegal characters like that when validating. If you check an email address with illegal characters, that should not validate.
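To illustrate the point of this comment: sanitizing first can turn an input that fails validation into one that passes. A small sketch, using an arbitrary sample input chosen for illustration:

```php
<?php
// Parentheses are not allowed here, so the raw input fails validation...
$input = '(bogus@example.org)';

// ...but FILTER_SANITIZE_EMAIL silently strips them.
$sanitized = filter_var($input, FILTER_SANITIZE_EMAIL);

var_dump(filter_var($input, FILTER_VALIDATE_EMAIL) !== false);     // raw input
var_dump(filter_var($sanitized, FILTER_VALIDATE_EMAIL) !== false); // after stripping
```

If the sanitized value is what gets stored, the user ends up with an address they never typed, which is why comparing it with the original (as in smottt's answer) or skipping the sanitize step is usually preferable.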
Community

I answered this in the 'top question' about email verification: https://stackoverflow.com/a/41129750/1848217

For me, the right way to check emails is:

1. Check that the @ symbol exists, with some non-@ symbols before and after it: /^[^@]+@[^@]+$/
2. Try to send an email to this address with some "activation code".
3. When the user has "activated" the email address, we know that all is right.

Of course, you can show a warning or tooltip in the front end when the user types a "strange" email, to help them avoid common mistakes such as no dot in the domain part or spaces in the name without quoting. But you must accept the address "hello@world" if the user really wants it. Also, remember that the email address standard has evolved and can evolve again, so you can't just write some "standard-valid" regexp once and for all time. And remember that some concrete internet servers may get details of the common standard wrong and in fact work with their own "modified standard".

So, just check for @, hint the user on the front end, and send a verification email to the given address.
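A rough sketch of that approach is below. Both function names are made up for illustration, and how the code is stored and mailed is left out because it depends entirely on your application.

```php
<?php
// Step 1: only require "something@something" with exactly one "@".
function looksLikeEmail(string $email): bool
{
    return preg_match('/^[^@]+@[^@]+$/', $email) === 1;
}

// Step 2: generate an unguessable activation code to mail to the user.
// random_bytes() is cryptographically secure; 16 bytes give 32 hex characters.
function newActivationCode(): string
{
    return bin2hex(random_bytes(16));
}
```

The address only counts as verified once the user sends the code back, for example by opening a link that contains it.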


Your regex does check for @, but it doesn't really check that it's valid per any of the RFCs that govern email. It also doesn't work as written. I ran it through regex101.com and it failed to match valid addresses
Did you read only the regex or the whole answer? I fully disagree with you. Just tell me, please, according to which RFC does the gmail.com server assume that joe@gmail.com and jo.e@gmail.com are the same address? There are a lot of servers which do not work by the standards, or not by the FRESH standards, but they still serve their users' email. If you write some regexp once and validate only by that, you have no guarantee that it will stay right in the future and that your future users will not fail with their "new-way" emails. So my position is the same: if you want to verify an email address, the main point is to just send an activation email.
@Machavity but thanks for bugreport in regexp, i fixed it from /^[^@]+@[^@+]$/ to /^[^@]+@[^@]+$/
Props to you for fixing the regex, but how does that improve over the filter_var method? It doesn't fix the problem of it accepting badly formatted addresses either. Your regex will happily accept joe@domain as a valid email address, when it's not
@Machavity, well, for example, there's a specific version of PHP on your server and you can't update it to the newest. Say you have PHP 5.5.15, the standard for valid emails was extended in 2018, and it will be implemented in PHP 7.3.10 soon, with a well-working filter_var($email, FILTER_VALIDATE_EMAIL, $newOptions). But you have the old function on the server and in some cases you can't update, so you will lose clients with some new valid emails. Also, once more, note that not all email servers work strictly according to the common and modern standard for email addresses.
Bud Damyanov

If you want to check whether the domain of an email address is valid, use something like:

/*
* Check for valid MX record for given email domain
*/
if(!function_exists('check_email_domain')){
    function check_email_domain($email) {
        //Get host name from email and check if it is valid
        $email_host = explode("@", $email);     
        //Add a dot to the end of the host name to make a fully qualified domain name and get last array element because an escaped @ is allowed in the local part (RFC 5322)
        $host = end($email_host) . "."; 
        //Convert to ascii (http://us.php.net/manual/en/function.idn-to-ascii.php)
        return checkdnsrr(idn_to_ascii($host), "MX"); //(bool)       
    }
}

This is a handy way to filter out a lot of invalid email addresses, along with standard email validation, because a valid email format does not mean a valid email.

Note that the idn_to_ascii() function (or its sister function idn_to_utf8()) may not be available in your PHP installation: it requires the intl extension (the manual lists PECL intl >= 1.0.2 and PECL idn >= 0.1).

Also keep in mind that IPv4 or IPv6 as domain part in email (for example user@[IPv6:2001:db8::1]) cannot be validated, only named hosts can.

See more here.


I don't think it will work if the host portion of the email address is an IP address in IPv6 format
An MX entry is not necessary to receive emails. If none is present, the A entry will be used. See serverfault.com/questions/470649/…
smulholland2

If you're just looking for an actual regex that allows for various dots, underscores and dashes, it is as follows: [a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z]+. That will allow a fairly stupid-looking email like tom_anderson.1-neo@my-mail_matrix.com to be validated.


Thielicious
/(?![[:alnum:]]|@|-|_|\.)./

Nowadays, if you use an HTML5 form with type=email then you're already about 80% safe, since browser engines have their own validator. To complement it, add this regex to your preg_match_all() and negate it:

if (!preg_match_all("/(?![[:alnum:]]|@|-|_|\.)./",$email)) { .. }

Find the regex used by HTML5 forms for validation
https://regex101.com/r/mPEKmy/1


I hate downvotes too w/o explanation. Well I guess he might say: Browser email check (client side) is not secure at all. Anyone can send anything to a server by changing the code. So it's obvious and the most secure way to do the check (again) server side. The question here is based on PHP, so its obvious Cameron was looking for a server solution and not for a client solution.
This answer may not be fully PHP related, but its HTML suggestion covers the "standard" user using just a phone/PC. Also, the user gets the info directly in "his" browser while using the site. Real checks on the server side are not covered by this, sure. Btw, @Thielicious mentioned a PHP change, so his comment is related IMHO.
It probably received downvotes due to the assumption that you're "80% safe since browser engines have their own validator". There are many other ways to send HTTP requests than through a browser, so you can't assume that any request is safe...even if you check the browser agent.
Stephen

There is a better regex built into FILTER_VALIDATE_EMAIL, but any regex can give bad results.

For example..

// "not an email" is invalid, so it's false.
php > var_export(filter_var("not an email", FILTER_VALIDATE_EMAIL));
false
// "foo@a.com" looks like an email, so it passes even though it's not real.
php > var_export(filter_var("foo@a.com", FILTER_VALIDATE_EMAIL));
'foo@a.com'
// "foo@gmail.com" passes, gmail is a valid email server,
//  but gmail requires more than 3 letters for the address.
php > var_export(filter_var("foo@gmail.com", FILTER_VALIDATE_EMAIL));
'foo@gmail.com'

You might want to consider using an API like Real Email, which can do in-depth mailbox inspections to check whether the email is real.

A bit like ..

$email = "foo@bar.com";
$api_key = "YOUR_API_KEY"; // placeholder: supply your own key

$request_context = stream_context_create(array(
    'http' => array(
        'header'  => "Authorization: Bearer " . $api_key
    )
));

// URL-encode the address so characters like "+" survive the query string
$result_json = file_get_contents("https://isitarealemail.com/api/email/validate?email=" . urlencode($email), false, $request_context);

// Decode once; a failed request (false) falls through to "unknown"
$result = ($result_json === false) ? null : json_decode($result_json, true);
$status = $result['status'] ?? null;

if ($status == "valid") {
    echo("email is valid");
} else if ($status == "invalid") {
    echo("email is invalid");
} else {
  echo("email was unknown");
}

ThinkTrans

There are three RFCs that lay down the foundation for the "Internet Message Format".

RFC 822

RFC 2822 (Supersedes RFC 822)

RFC 5322 (Supersedes RFC 2822)

RFC 5322, however, defines e-mail IDs and their naming structure in the most technical manner. That is more suitable for laying down the foundation of an Internet Standard that is liberal enough to allow all the use cases, yet conservative enough to bind it in some formalism.

However, the e-mail validation requirement from the software developer community, has the following needs -

to stave off unwanted spammers

to ensure the user does not make inadvertent mistake

to ensure that the e-mail ID belongs to the actual person inputting it

They are not exactly interested in implementing a technically all-encompassing definition that allows all the forms (IP addresses, including port IDs and all) of e-mail ID. The solution suitable for their use case is expected solely to ensure that all legitimate e-mail holders are able to get through. The definition of "legitimate" differs vastly between the technical standpoint (the RFC 5322 way) and the usability standpoint (this solution). The usability aspect of the validation aims to ensure that all the e-mail IDs accepted by the validation mechanism belong to actual people who use them for communication. This introduces another angle to the validation process: ensuring an actually "in-use" e-mail ID, a requirement for which the RFC 5322 definition is clearly not sufficient.

Thus, on practical grounds, the actual requirements boil down to this -

To ensure some very basic validation checks

To ensure that the inputted e-mail is in use

The second requirement typically involves sending a standard response-seeking e-mail to the inputted e-mail ID and authenticating the user based on the action delineated in the response mechanism. This is the most widely used mechanism for validating an "in use" e-mail ID. It does involve a round trip through the back-end server and is not a straightforward single-screen implementation; however, one cannot do away with it.

The first requirement stems from the need that developers do not want totally "non e-mail like" strings to pass as an e-mail. This typically involves blanks, and strings without an "@" sign or without a domain name. Given the punycode representations of domain names, if one needs to enable domain validation, one needs to engage in a full-fledged implementation that ensures a valid domain name. Thus, given the basic nature of the requirement in this regard, validating for "<something>@<something>.<something>" is the only apt way of satisfying it.

A typical regex that can satisfy this requirement is: ^[^@\s]+@[^@\s.]+\.[^@\s.]+$ The above regex follows the standard Perl regular-expression syntax, widely followed by the majority of programming languages. The validation statement is: <something>@<something>.<something>
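In PHP, that basic check could be wrapped as below. This is a sketch only: hasBasicEmailShape() is an illustrative name, and it assumes the dot between the two domain parts is meant literally (written as \. in the pattern).

```php
<?php
// Accepts "<something>@<something>.<something>" with no whitespace,
// a single "@", and exactly one dot in the domain part.
function hasBasicEmailShape(string $email): bool
{
    return preg_match('/^[^@\s]+@[^@\s.]+\.[^@\s.]+$/', $email) === 1;
}

var_dump(hasBasicEmailShape('first.last@example.com'));
var_dump(hasBasicEmailShape('user@example'));
```

One thing to be aware of: as written, the pattern allows only one dot after the "@", so an address such as user@mail.example.com does not match. If such domains matter to you, the domain part of the pattern would need loosening.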

For those who want to go one step deeper into more relevant implementations, the following validation methodology applies: <e-mail local part>@<domain name>

For <e-mail local part>, follow the guidelines by the "Universal Acceptance Steering Group": UASG-026.

For <domain name>, you can follow any domain validation methodology using standard libraries, depending on your programming language.

For the recent studies on the subject, follow the document UASG-018A.

Those who are interested in the overall process, challenges and issues one may come across while implementing the Internationalized Email Solution can also go through the following RFCs:

RFC 6530 (Overview and Framework for Internationalized Email)

RFC 6531 (SMTP Extension for Internationalized Email)

RFC 6532 (Internationalized Email Headers)

RFC 6533 (Internationalized Delivery Status and Disposition Notifications)

RFC 6855 (IMAP Support for UTF-8)

RFC 6856 (Post Office Protocol Version 3 (POP3) Support for UTF-8)

RFC 6857 (Post-Delivery Message Downgrading for Internationalized Email Messages)

RFC 6858 (Simplified POP and IMAP Downgrading for Internationalized Email)


ThinkTrans

The question title is fairly generic, however the body of the question indicates that it is about the PHP based solution. Will try to address both.

Generically speaking, for all programming languages: "validating" an e-mail address with a regex is something that any internet-based service provider should desist from. The variety of possible domain names and e-mail addresses has increased so much that any attempt at validation which is not well thought out may end up denying some valid users entry to your system. To avoid this, one of the best ways is to send an email to the user and verify that it was received. The good folks at the "Universal Acceptance Steering Group" have compiled a language-wise list of libraries which are found to be compliant/non-compliant with various parameters involving validations vis-a-vis Internationalized Domain Names and Internationalized Email addresses. Please find the links to those documents over here and here.

Speaking specifically of PHP: there is one good library available in PHP, EmailValidator. It is an email address validator that includes many validation methods, such as DNS validation. The validation specifically recommended is called RFCValidation and validates email addresses against several RFCs. It has good compliance when it comes to being inclusive towards IDNs and Internationalized Email addresses.


user1444314

I've made Python & PHP implementations that properly verify ANY email address, i.e. confirm with the mail server of a real domain that the address is a real one.

Released under the GPL-3.0 license.

There you go:

https://lja.fi/index.php/github-stuff/

--lja


Gray Programmerz

I have prepared a function that checks email validity:

function isValidEmail($email)
{
    $re = '/([\w\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/m';
    preg_match_all($re, $email, $matches, PREG_SET_ORDER, 0);
    if(count($matches) > 0) return $matches[0][0] === $email;
    return false;
}

The problem with FILTER_VALIDATE_EMAIL is that it considers even invalid emails as valid.

Here are some examples:

if(isValidEmail("foo@gmail.com")) echo "valid";
if(!isValidEmail("fo^o@gmail.com")) echo "invalid";