★ Three types of mocks

Mocking and faking might sound like intimidating words if you don't know what they're about, but once you do, you'll be able to improve your testing skills significantly.

Part of "the art of testing" is being able to test code in some level of isolation to make sure a test suite is trustworthy and versatile. These topics are so important that we actually made five or six videos on them in our Testing Laravel course.

In this post, I want to share three ways to deal with mocking and faking. Let's dive in!

Laravel's Fakes

Laravel has seven fakes — eight if you count time as well:

Bus
Event
HTTP
Mail
Notification
Queue
Storage
Time

Laravel fakes are useful because they are built-in ways to disable some of the core parts of the framework during testing while still being able to make assertions on them. Here's an example of using the Storage fake to assert that a file would have been saved in the correct place if the code were run for real, outside of your test suite:

Storage::fake('public');

$post = BlogPost::factory()->create();

Storage::disk('public')
    ->assertExists("blog/{$post->slug}.png");
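
For context, here's a minimal sketch of the kind of production code such a test could cover; the class name and the way the image gets generated are hypothetical:

use Illuminate\Support\Facades\Storage;

// Hypothetical production code: saves a PNG for a blog post on the
// public disk. In the test, Storage::fake('public') swaps that disk
// for a temporary one, so the assertion runs without touching your
// real storage.
class SaveBlogPostImage
{
    public function handle(BlogPost $post, string $pngContents): void
    {
        Storage::disk('public')->put("blog/{$post->slug}.png", $pngContents);
    }
}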

Mockery

Laravel has built-in support for Mockery, a library that allows you to create mocks — fake implementations of a class — on the fly.

Here, we mock an RssRepository so that we don't perform an actual HTTP request but instead return some dummy data:

$rss = $this->mock(
    RssRepository::class,
    function (MockInterface $mock) {
        $mock
            ->shouldReceive('fetch')
            ->andReturn(collect([
                new RssEntry(/* … */),
            ]));
    }
);

You can imagine how using mocks like this can significantly improve the performance and reliability of your test suite.
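
To show where that mock ends up, here's a hypothetical controller that depends on RssRepository; because $this->mock() binds the mock into the container, Laravel injects it instead of the real implementation during the test:

// Hypothetical controller: RssRepository is resolved from the
// container, so in tests it will be the Mockery mock created above.
class RssFeedController
{
    public function __invoke(RssRepository $rss)
    {
        return view('rss.index', [
            'entries' => $rss->fetch('https://example.com/feed.xml'),
        ]);
    }
}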

Handcrafted mocks

Mockery can sometimes feel heavy or complex, depending on your use case. My personal preference is to use handcrafted mocks instead: a different implementation of an existing class, one that you register in Laravel's container when running tests. Here's an example:

class RssRepositoryFake extends RssRepository
{
    public function fetch(string $url): Collection
    {
        return collect([
            new RssEntry(/* … */),
        ]);
    }

    public static function setUp(): void
    {
        // Reset static state (this assumes RssRepository keeps
        // track of fetched URLs in a static property)
        self::$urls = [];

        app()->instance(
            RssRepository::class,
            new self(),
        );
    }
}

By cleverly using the service container, we can override our real RssRepository with one that doesn't actually perform any HTTP requests. If you're curious to learn more about them, you can check out our Testing Laravel course.
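
As a sketch of how that could look in practice (the route and assertion are hypothetical):

it('renders the feed without hitting the network', function () {
    // Bind RssRepositoryFake into the container instead of the
    // real repository.
    RssRepositoryFake::setUp();

    // Any code that resolves RssRepository now gets the fake,
    // so no HTTP request is performed.
    $this->get('/feed')->assertOk();
});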


★ A Laravel package to crawl and index content of your sites

The newly released spatie/laravel-site-search package can crawl and index the content of one or more sites. You can think of it as a private Google search for your sites. Like most Spatie packages, it is highly customizable: you have total control over what content gets crawled and indexed.

To see the package in action, head over to the search page of this very blog.

In this post, I'd like to introduce the package to you and highlight some implementation and testing details. Let's dig in!

Are you a visual learner?

In this stream on YouTube, I'll demo the package and dive into its source code. All questions are welcome in the chat.

Why we created this package

In our ecosystem, there are already several options to create a search index. Let's compare them with our new package.

Laravel Scout is an excellent package for adding search capabilities to Eloquent models. This is very useful if you want to provide a structured search. For example, if you have a Product model, Scout can help build up a search index to search the properties of these products.

The main difference between Scout and laravel-site-search is that laravel-site-search is not tied to Eloquent models. Like Google, it will crawl your entire site and index all content that is there.

Another nice indexing option is Algolia Docsearch. It can add search capabilities to open-source documentation for free.

Our laravel-site-search package may be used to index non-open-source content as well. Where Docsearch makes basic assumptions about how the content is structured, our package makes a best effort to index all kinds of content.

You could also opt for the Meilisearch Doc Scraper, which can handle non-open-source content as well. It's written in Python, though, so it's not that easy to integrate with a PHP app.

Our package is, of course, written in PHP and can be customized very easily; you can even add custom properties.

To summarise: our package can be used for all kinds of content, and it can be easily customized when installed in a Laravel app.

Crawling your first site

First, you must follow the installation instructions. This involves installing the package and installing Meilisearch. The docs even mention how you can install and run Meilisearch on a Forge provisioned server.

After you've installed the package, you can run this command to define a site that needs to be indexed.

php artisan site-search:create-index

This command will ask for a name for your index and the URL of your site that should be crawled. Of course, you could run that command multiple times to create multiple indexes.

After that, you should run this command to start a queued job. You should probably schedule that command to run every couple of hours so that the index is kept in sync with your site's latest content.

php artisan site-search:crawl
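
Scheduling that command could look like this in app/Console/Kernel.php; a minimal sketch, where the frequency is just an example:

use Illuminate\Console\Scheduling\Schedule;

class Kernel extends \Illuminate\Foundation\Console\Kernel
{
    protected function schedule(Schedule $schedule): void
    {
        // Re-crawl and re-index every few hours so the search index
        // stays in sync with the site's latest content.
        $schedule->command('site-search:crawl')->everyThreeHours();
    }
}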

The job started by that command will:

create a new Meilisearch index
crawl your website using multiple concurrent connections to improve performance
transform crawled content to something that can be put in the search index
mark that new Meilisearch index as the active one
delete the old Meilisearch index

Finally, you can use Search to perform a query on your index.

use Spatie\SiteSearch\Search;

$searchResults = Search::onIndex($indexName)
    ->query('your query')
    ->get();

This is how you could render the results in a Blade view:

<ul>
    @foreach($searchResults->hits as $hit)
        <li>
            <a href="{{ $hit->url }}">
                <div>{{ $hit->url }}</div>
                <div>{{ $hit->title() }}</div>
                <div>{!! $hit->highlightedSnippet() !!}</div>
            </a>
        </li>
    @endforeach
</ul>

That is basically how you can use the package. On the search page of this very blog, you can see the package in action. I've also open-sourced my blog, so on GitHub, you'll be able to see the Livewire component and Blade view that power the search page.

Customizing what gets crawled and indexed

In most cases, you don't want to index all content that is available on your site. A few examples of this are menu structures or list pages (e.g. a list with blog posts with links to the detail pages of those posts).

We've made it easy to ignore such content. In the config file, there's an option called ignore_content_on_urls. Your homepage, for example, probably contains no unique content, but rather links to pages where the full content lives.

You can ignore the content on the homepage by adding /. We'll still crawl the homepage but not put any of its content in the index.

/*
 * When crawling your site, we will ignore content that is on these URLs.
 *
 * All links on these URLs will still be followed and crawled.
 *
 * You may use `*` as a wildcard.
 */
'ignore_content_on_urls' => [
    '/',
],

You can also ignore content based on CSS selectors. There's an option ignore_content_by_css_selector in the config file that lets you specify any CSS selector.

If your menu structure is in a nav element, you can add nav. You could also introduce a data attribute that you could slap on any content you don't want in your index.

So with this configuration:

/*
 * When indexing your site, we will not add any content to the search
 * index that is selected by these CSS selectors.
 *
 * All links inside such content will still be crawled, so it's safe
 * to add a selector for your menu structure.
 */
'ignore_content_by_css_selector' => [
    'nav',
    '[data-no-index]',
],

... the content of the second div below won't get indexed:

<div>
    This will get indexed
</div>

<div data-no-index>
    This won't get indexed but <a href="/other-page">this link</a> will still be followed.
</div>

Using a search profile

For a lot of users, the above config options will be enough. If you want to control what gets indexed and crawled programmatically, you can use a search profile.

A search profile determines which pages get crawled and what content gets indexed. In the site-search config file, you'll find a default_profile key; Spatie\SiteSearch\Profiles\DefaultSearchProfile::class is used by default.

This default profile will instruct the indexing process:

to crawl each page of your site
to only index pages whose response had a 200 status code
to not index a page if the response had a header site-search-do-not-index

By default, the crawling process will respect the robots.txt of your site.
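
For example, to keep a particular page out of the index while still letting its links be crawled, you could add that header to the response; a quick sketch with a hypothetical route:

// Hypothetical route: the site-search-do-not-index header tells the
// default profile to skip indexing this page.
Route::get('/internal-news', function () {
    return response()
        ->view('internal-news')
        ->header('site-search-do-not-index', 'true');
});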

If you want to customize the crawling and indexing behaviour, you could opt to extend Spatie\SiteSearch\Profiles\DefaultSearchProfile or create your own class that implements the Spatie\SiteSearch\Profiles\SearchProfile interface. This is what that interface looks like:

namespace Spatie\SiteSearch\Profiles;

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\UriInterface;
use Spatie\Crawler\Crawler;
use Spatie\SiteSearch\Indexers\Indexer;

interface SearchProfile
{
    public function shouldCrawl(UriInterface $url, ResponseInterface $response): bool;

    public function shouldIndex(UriInterface $url, ResponseInterface $response): bool;

    public function useIndexer(UriInterface $url, ResponseInterface $response): ?Indexer;

    public function configureCrawler(Crawler $crawler): void;
}
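
As an illustration, here's a hypothetical profile that extends the default one to keep everything under /admin out of the index, while still crawling those pages:

use Psr\Http\Message\ResponseInterface;
use Psr\Http\Message\UriInterface;
use Spatie\SiteSearch\Profiles\DefaultSearchProfile;

// Hypothetical profile: pages under /admin are still crawled,
// but their content never ends up in the search index.
class NoAdminSearchProfile extends DefaultSearchProfile
{
    public function shouldIndex(UriInterface $url, ResponseInterface $response): bool
    {
        if (str_starts_with($url->getPath(), '/admin')) {
            return false;
        }

        return parent::shouldIndex($url, $response);
    }
}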

Indexing extra properties

Only the page title, URL, description, and some content are added to the search index by default. However, you can add any extra property you want.

You do this by creating a custom indexer and overriding the extra method.

use Spatie\SiteSearch\Indexers\DefaultIndexer;

class YourIndexer extends DefaultIndexer
{
    public function extra(): array
    {
        return [
            'authorName' => $this->functionThatExtractsAuthorName(),
        ];
    }

    public function functionThatExtractsAuthorName()
    {
        // add logic here to extract the author name using
        // the `$response` property that's set on this class
    }
}

The extra properties will be available on a search result hit.

$searchResults = Search::onIndex('my-index')->query('your query')->get();

$firstHit = $searchResults->hits->first();

$firstHit->authorName; // returns the author name

Let's take a look at the tests

When writing tests, I usually prefer to write feature tests. They give me the highest confidence that everything is working correctly.

In the case of this package, a proper feature test would encompass crawling and indexing a site, then performing a query against the built-up search index and verifying that the results are correct.

In our test suite, we do precisely that. Let's first take a look at the test itself.

it('can crawl and index all pages', function () {
    Server::activateRoutes('chain');

    dispatch(new CrawlSiteJob($this->siteSearchConfig));

    waitForMeilisearch($this->siteSearchConfig);

    $searchResults = Search::onIndex($this->siteSearchConfig->name)
        ->query('here')
        ->get();

    expect(hitUrls($searchResults))->toEqual([
        'http://localhost:8181/',
        'http://localhost:8181/2',
        'http://localhost:8181/3',
    ]);
});

The site that we're going to crawl is not a real site. The crawl_url used in $this->siteSearchConfig is set to localhost:8181. This site is served by a Lumen application that is booted whenever the tests run.

The first line of our test is Server::activateRoutes('chain'). This will make our Lumen application load and use a certain routes file. In this case, we will let our Lumen app use the chain.php routes file. This is what that routes file looks like:

$router->get('/', fn () => view('chain/1'));
$router->get('2', fn () => view('chain/2'));
$router->get('3', fn () => view('chain/3'));
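
To give you an idea, a fixture view like chain/1 could be as simple as this (hypothetical contents; each page contains the word "here" that the test queries for, plus a link to the next page):

{{-- resources/views/chain/1.blade.php (hypothetical) --}}
<html>
    <body>
        <p>You can find some content here.</p>
        <a href="/2">Next page</a>
    </body>
</html>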

So basically, our Lumen app is now a mini-site that serves a couple of chained pages.

In the following lines of our test, we're dispatching the job that will crawl and index that site.


// in our test

dispatch(new CrawlSiteJob($this->siteSearchConfig));

waitForMeilisearch($this->siteSearchConfig);

That waitForMeilisearch call also deserves a bit of explanation. When something is saved to a Meilisearch index, that bit of info won't be searchable immediately: Meilisearch needs a bit of time to process everything. Our tests need to wait for that; otherwise, they could randomly fail because our expectations would sometimes run before the indexing is complete.

Luckily, Meilisearch has an API that can determine whether all updates to an index are processed. Here's the implementation of waitForMeilisearch. We simply wait for Meilisearch's processing to be done.

function waitForMeilisearch(SiteSearchConfig $siteSearchConfig): void
{
    $indexName = $siteSearchConfig->refresh()->index_name;

    while (MeiliSearchDriver::make($siteSearchConfig)->isProcessing($indexName)) {
        sleep(1);
    }
}

After Meilisearch has done its work, we will perform a query against the Meilisearch index and expect certain URLs to be returned.

// in our test

$searchResults = Search::onIndex($this->siteSearchConfig->name)
    ->query('here')
    ->get();

expect(hitUrls($searchResults))->toEqual([
    'http://localhost:8181/',
    'http://localhost:8181/2',
    'http://localhost:8181/3',
]);

With that Lumen test server and the waitForMeilisearch function, we can test most functionalities of the package. Here's the test that makes sure the ignore_content_on_urls option works.

When crawling the same chain as above, but adding /2 to ignore_content_on_urls, we expect that only / and /3 end up in the index.

it('can be configured not to index certain urls', function () {
    Server::activateRoutes('chain');

    config()->set('site-search.ignore_content_on_urls', [
        '/2',
    ]);

    dispatch(new CrawlSiteJob($this->siteSearchConfig));

    waitForMeilisearch($this->siteSearchConfig);

    $searchResults = Search::onIndex($this->siteSearchConfig->name)
        ->query('here')
        ->get();

    expect(hitUrls($searchResults))->toEqual([
        'http://localhost:8181/',
        'http://localhost:8181/3',
    ]);
});

This kind of test gives me a lot of confidence that everything in the package is working correctly. If you want to see more tests, head over to the test suite on GitHub.

In closing

I hope you liked this little tour of the package. There are a lot of options not mentioned in this blog post: you can create synonyms, add extra properties, and much more.

We spent a lot of time making every aspect of the crawling and indexing behaviour customizable. Discover all the options in our extensive docs.

This isn't the first package our team has made. Our website has an open source section that lists every package we've published. I'm pretty sure there's something there for your next project. Currently, all of our packages combined are downloaded 10 million times a month.

Our team doesn't only create open-source packages, but also paid digital products, such as Ray, Mailcoach and Flare. Our team also creates premium video courses, such as Laravel Beyond CRUD, Testing Laravel, Laravel Package Training and Event Sourcing in Laravel. If you want to support our open source efforts, consider picking up one of our paid products.


★ Replacing Keytar with Electron's safeStorage in Ray

Ray is an app we built at Spatie to make debugging your applications easier and faster. Being web developers, we naturally decided to write this app in Electron, which enabled us to move from nothing to a working prototype to a released product on 3 separate platforms within a matter of weeks.

About 9 months ago, Alex added a much-requested feature that allows you to connect to remote servers and receive their Ray outputs securely over SSH.

To save the credentials to a server, we needed to find a secure way to save the password or private key passphrase. We quickly settled on node-keytar, a native Node.js module that leverages your system's keychain (Keychain/libsecret/Credential Vault/…) to safely store passwords, hidden from other applications and users.

Using a native Node module brought one major disadvantage: we couldn't easily build for other platforms anymore without running the actual build on that platform. There are some solutions provided by electron-builder, Keytar, and other packages, but these all came with their own layer of overhead. We eventually decided to run the build on all platforms separately using the GitHub Actions CI.

Three weeks ago, on September 21, 2021, Electron 15 was released, and somewhat hidden in the release notes we found a mention of a newly added string encryption API: safeStorage (PR/docs). Like Keytar, Electron's safeStorage uses the system's keychain to securely encrypt strings, but without the need for an extra dependency.

We jumped at the idea of simplifying our build process and removing a dependency, and wrote this simple implementation using safeStorage and electron-store, with an external API inspired by Keytar:

import { safeStorage } from 'electron';
import Store from 'electron-store';

const store = new Store<Record<string, string>>({
    name: 'ray-encrypted',
    watch: true,
    encryptionKey: 'this_only_obfuscates',
});

export default {
    setPassword(key: string, password: string) {
        const buffer = safeStorage.encryptString(password);
        store.set(key, buffer.toString('latin1'));
    },

    deletePassword(key: string) {
        store.delete(key);
    },

    getCredentials(): Array<{ account: string; password: string }> {
        return Object.entries(store.store).reduce((credentials, [account, buffer]) => {
            return [...credentials, { account, password: safeStorage.decryptString(Buffer.from(buffer, 'latin1')) }];
        }, [] as Array<{ account: string; password: string }>);
    },
};

The only downside is that any passwords and passphrases previously saved using Keytar are now inaccessible to the app, but you can still find them by opening your system's keychain application and looking for any mentions of ray, ssh_password_, or private_key_. We also hope Ray wasn't the only spot where you saved your server passwords.

Upon opening the servers overlay, existing users will receive a notification that their credentials need to be entered again. Ray will save which servers have not had their credentials updated yet, and will display this to the user. We used electron-store's migrations feature for this:

migrations: {
    '>=1.18.0': (store) => {
        store.set(
            'servers',
            store.get('servers').map((server) => ({ ...server, needsCredentialsUpdate: true }))
        );
    },
},

★ Making 1Password understand where your change password page is located

A few days ago, a new version of 1Password was released that can detect where a user can reset their password.

This is how it looks in 1Password:

When you click that "Change password" item, 1Password will open up a tab in your browser on the right page at Oh Dear to change the password.

This is pretty convenient if you ask me.

1Password knows where the change password page is located thanks to the "Well-Known URL for Changing Passwords" specification. The spec says that a request to <your-domain>/.well-known/change-password should redirect to the change password page on your site.

So, behind the scenes, 1Password simply requests /.well-known/change-password and checks if a redirect is made.

In Laravel, you can easily create such a redirect in a routes file.

Route::redirect('/.well-known/change-password', '/url-of-your-change-password-page');

I was tempted to use a closure and the route name, but this code would make the routes uncacheable.

// do not use this if you want to use route caching
Route::get('.well-known/change-password', fn() => redirect()->route('profile.show'));

I can highly recommend adding a /.well-known/change-password redirect to your projects.
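
If you want to guard that redirect with a test, a minimal sketch could look like this (the test name is made up, and the target URL is whatever your change password page happens to be):

it('redirects the well-known change password url', function () {
    $this->get('/.well-known/change-password')
        ->assertRedirect('/url-of-your-change-password-page');
});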
